diff --git a/.changeset/little-walls-sneeze.md b/.changeset/little-walls-sneeze.md new file mode 100644 index 000000000..b9f014ded --- /dev/null +++ b/.changeset/little-walls-sneeze.md @@ -0,0 +1,5 @@ +--- +'@openfn/cli': minor +--- + +fetch: allow state files to be written to JSON with --format diff --git a/claude.md b/claude.md new file mode 100644 index 000000000..b2f45c047 --- /dev/null +++ b/claude.md @@ -0,0 +1,139 @@ +# OpenFn Kit + +This monorepo contains the core packages that power OpenFn's workflow automation platform. OpenFn is a Digital Public Good trusted by NGOs and governments in 40+ countries to automate data integration workflows. + +## Architecture + +The repository has three main packages: **CLI**, **Runtime**, and **Worker**. The CLI and Worker are both frontends for executing workflows - the CLI for local development, the Worker for production execution via Lightning (the web platform). Both wrap the Runtime as their execution engine. The Worker uses engine-multi to wrap the Runtime for multi-process execution. + +## Core Packages + +- **[@openfn/cli](packages/cli)** - Command-line interface for local development. Run, test, compile, and deploy workflows. +- **[@openfn/runtime](packages/runtime)** - Core execution engine. Safely executes jobs in a sandboxed VM environment. +- **[@openfn/ws-worker](packages/ws-worker)** - WebSocket worker connecting Lightning to the Runtime. Stateless server that pulls runs from Lightning's queue. See [.claude/event-processor.md](.claude/event-processor.md) for event processing details. +- **[@openfn/engine-multi](packages/engine-multi)** - Multi-process runtime wrapper used by ws-worker for concurrent workflow execution. +- **[@openfn/compiler](packages/compiler)** - Transforms OpenFn job DSL into executable JavaScript modules. 
+ +## Supporting Packages + +- **@openfn/lexicon** - Shared TypeScript types +- **@openfn/logger** - Structured logging utilities +- **@openfn/describe-package** - TypeScript analysis for adaptor docs (to be phased out) +- **@openfn/deploy** - Deployment logic for Lightning (soon to be deprecated) +- **@openfn/project** - Models and understands local OpenFn projects +- **@openfn/lightning-mock** - Mock Lightning server for testing + +## AI Assistant + +- Keep responses terse and do not over-explain. Users will ask for more guidance if they need it. +- Always present users a short action plan and ask for confirmation before doing it +- Keep the human in the loop at all times. Stop regularly and check for guidance. + +## Key Concepts + +**Workflows** are sequences of **jobs** that process data through steps. Each **job** is an array of **operations** (functions that transform state). State flows between jobs based on conditional edges. + +**Adaptors** are npm packages (e.g., `@openfn/language-http`) providing operations for specific systems. The CLI auto-installs them as needed. + +The **Compiler** transforms job DSL code into standard ES modules with imports and operation arrays. 
+ +## Development Setup + +### Prerequisites + +- Node.js 18+ (use `asdf`) +- pnpm (enable with `corepack enable`) + +### Common Commands + +```bash +# Root +pnpm install # Install dependencies +pnpm build # Build all packages +pnpm test # Run all tests +pnpm changeset # Add a changeset for your PR + +# CLI +cd packages/cli +pnpm openfn test # Run from source +pnpm install:global # Install as 'openfnx' for testing + +# Worker +cd packages/ws-worker +pnpm start # Connect to localhost:4000 +pnpm start -l mock # Use mock Lightning +pnpm start --no-loop # Disable auto-fetch +curl -X POST http://localhost:2222/claim # Manual claim +``` + +### Environment Variables + +- `OPENFN_REPO_DIR` - CLI adaptor storage +- `OPENFN_ADAPTORS_REPO` - Local adaptors monorepo path +- `OPENFN_API_KEY` - API key for Lightning deployment +- `OPENFN_ENDPOINT` - Lightning URL (default: app.openfn.org) +- `WORKER_SECRET` - Worker authentication secret + +## Repository Structure + +``` +packages/ +├── cli/ # CLI entry: cli.ts, commands.ts, projects/, options.ts +├── runtime/ # Runtime entry: index.ts, runtime.ts, util/linker +├── ws-worker/ # Worker entry: start.ts, server.ts, api/, events/ +├── compiler/ # Job DSL compiler +├── engine-multi/ # Multi-process wrapper +├── lexicon/ # Shared TypeScript types +└── logger/ # Logging utilities +``` + +## Testing & Releases + +```bash +pnpm test # All tests +pnpm test:types # Type checking +pnpm test:integration # Integration tests +cd packages/cli && pnpm test:watch # Watch mode +``` + +## Testing Best Practice + +- Ensure tests are valuable before generating them. Focus on what's important. +- Treat tests as documentation: they should show how the function is expected to work +- Keep tests focused: test one thing in each test +- This repo contains extensive testing: check for similar patterns in the same package before improvising + +## Additional Documentation + +**Changesets**: Run `pnpm changeset` when submitting PRs. 
Releases publish automatically to npm on merge to main. + +The [.claude](.claude) folder contains detailed guides: + +- **[command-refactor.md](.claude/command-refactor.md)** - Refactoring CLI commands into project subcommand structure +- **[event-processor.md](.claude/event-processor.md)** - Worker event processing architecture (batching, ordering) + +## Code Standards + +- **Formatting**: Use Prettier (`pnpm format`) +- **TypeScript**: Required for all new code +- **TypeSync**: Run `pnpm typesync` after modifying dependencies +- **Tests**: Write tests and run `pnpm build` before testing (tests run against `dist/`) +- **Independence**: Keep packages loosely coupled where possible + +## Architecture Principles + +- **Separation of Concerns**: CLI and Worker are frontends; Runtime is the shared execution backend +- **Sandboxing**: Runtime uses Node's VM module for isolation +- **State Immutability**: State cannot be mutated between jobs +- **Portability**: Compiled jobs are standard ES modules +- **Zero Persistence (Worker)**: Worker is stateless; Lightning handles persistence +- **Multi-Process Isolation**: Worker uses engine-multi for concurrent workflow execution + +## Contributing + +1. Make changes +2. Run `pnpm test` +3. Add changeset: `pnpm changeset` +4. 
Open PR at https://github.com/openfn/kit + +**Resources**: [docs.openfn.org](https://docs.openfn.org) | [app.openfn.org](https://app.openfn.org) | [github.com/openfn/kit](https://github.com/openfn/kit) diff --git a/packages/cli/src/projects/fetch.ts b/packages/cli/src/projects/fetch.ts index d10c4626e..8ef164461 100644 --- a/packages/cli/src/projects/fetch.ts +++ b/packages/cli/src/projects/fetch.ts @@ -14,6 +14,7 @@ import { loadAppAuthConfig, getSerializePath, } from './util'; +import { writeFile } from 'node:fs/promises'; export type FetchOptions = Pick< Opts, @@ -23,6 +24,7 @@ export type FetchOptions = Pick< | 'endpoint' | 'env' | 'force' + | 'format' | 'log' | 'logJson' | 'snapshots' @@ -45,6 +47,7 @@ const options = [ po.outputPath, po.env, po.workspace, + po.format, ]; const command: yargs.CommandModule = { @@ -68,24 +71,72 @@ export default command; const printProjectName = (project: Project) => `${project.qname} (${project.id})`; -export const handler = async (options: FetchOptions, logger: Logger) => { +const fetchV1 = async (options: FetchOptions, logger: Logger) => { const workspacePath = options.workspace ?? process.cwd(); logger.debug('Using workspace at', workspacePath); const workspace = new Workspace(workspacePath, logger, false); - const { outputPath } = options; + // TODO we may need to resolve an alias to a UUID and endpoint + const localProject = workspace.get(options.project!); + if (localProject) { + logger.debug( + `Resolved "${options.project}" to local project ${printProjectName( + localProject + )}` + ); + } else { + logger.debug( + `Failed to resolve "${options.project}" to local project. Will send request to app anyway.` + ); + } - const localTargetProject = await resolveOutputProject( - workspace, - options, + const config = loadAppAuthConfig(options, logger); + + const { data } = await fetchProject( + options.endpoint ?? localProject?.openfn?.endpoint!, + config.apiKey, + localProject?.uuid ?? 
options.project!, logger ); + const finalOutputPath = getSerializePath( + localProject, + options.workspace, + options.outputPath + ); + + await writeFile(finalOutputPath, JSON.stringify(data, null, 2)); + logger.success(`Fetched project file to ${finalOutputPath}`); + + // TODO should we return a Project or just the raw state? + return data; +}; + +export const handler = async (options: FetchOptions, logger: Logger) => { + if (options.format === 'state') { + return fetchV1(options, logger); + } + return fetchV2(options, logger); +}; + +export const fetchV2 = async (options: FetchOptions, logger: Logger) => { + const workspacePath = options.workspace ?? process.cwd(); + logger.debug('Using workspace at', workspacePath); + + const workspace = new Workspace(workspacePath, logger, false); + const { outputPath } = options; + const remoteProject = await fetchRemoteProject(workspace, options, logger); - ensureTargetCompatible(options, remoteProject, localTargetProject); + if (!options.force && options.format == 'state') { + const localTargetProject = await resolveOutputProject( + workspace, + options, + logger + ); - // TODO should we use the local target project for output? + ensureTargetCompatible(options, remoteProject, localTargetProject); + } // Work out where and how to serialize the project const finalOutputPath = getSerializePath( @@ -94,7 +145,7 @@ export const handler = async (options: FetchOptions, logger: Logger) => { outputPath ); - let format: undefined | 'json' | 'yaml' = undefined; + let format: undefined | 'json' | 'yaml' | 'state' = options.format; if (outputPath) { // If the user gave us a path for output, we need to respect the format we've been given const ext = path.extname(outputPath!).substring(1) as any; @@ -112,12 +163,14 @@ export const handler = async (options: FetchOptions, logger: Logger) => { // TODO report whether we've updated or not // finally, write it! 
- await serialize(remoteProject, finalOutputPath!, format as any); - - logger.success( - `Fetched project file to ${finalOutputPath}.${format ?? 'yaml'}` + const finalPathWithExt = await serialize( + remoteProject, + finalOutputPath!, + format as any ); + logger.success(`Fetched project file to ${finalPathWithExt}`); + return remoteProject; }; @@ -193,7 +246,7 @@ export async function fetchRemoteProject( localProject?.openfn?.uuid && localProject.openfn.uuid !== options.project ) { - // ifwe resolve the UUID to something other than what the user gave us, + // if we resolve the UUID to something other than what the user gave us, // debug-log the UUID we're actually going to use projectUUID = localProject.openfn.uuid as string; logger.debug( diff --git a/packages/cli/src/projects/options.ts b/packages/cli/src/projects/options.ts index dc35a74b8..5c32399a4 100644 --- a/packages/cli/src/projects/options.ts +++ b/packages/cli/src/projects/options.ts @@ -9,6 +9,7 @@ export type Opts = BaseOpts & { removeUnmapped?: boolean | undefined; workflowMappings?: Record | undefined; project?: string; + format?: 'yaml' | 'json' | 'state'; }; // project specific options @@ -36,6 +37,15 @@ export const dryRun: CLIOption = { }, }; +export const format: CLIOption = { + name: 'format', + yargs: { + hidden: true, + description: + 'The format to save the project as - state, yaml or json. 
Use this to download raw state files.', + }, +}; + export const removeUnmapped: CLIOption = { name: 'remove-unmapped', yargs: { diff --git a/packages/cli/src/projects/util.ts b/packages/cli/src/projects/util.ts index e9202ba50..3e8817a08 100644 --- a/packages/cli/src/projects/util.ts +++ b/packages/cli/src/projects/util.ts @@ -44,26 +44,30 @@ const ensureExt = (filePath: string, ext: string) => { }; export const getSerializePath = ( - project: Project, - workspacePath: string, + project?: Project, + workspacePath?: string, outputPath?: string ) => { - const outputRoot = resolvePath(outputPath || workspacePath); + const outputRoot = resolvePath(outputPath || workspacePath || '.'); const projectsDir = project?.config.dirs.projects ?? '.projects'; - return outputPath ?? `${outputRoot}/${projectsDir}/${project.qname}`; + return outputPath ?? `${outputRoot}/${projectsDir}/${project?.qname}`; }; export const serialize = async ( project: Project, outputPath: string, - formatOverride?: 'yaml' | 'json', + formatOverride?: 'yaml' | 'json' | 'state', dryRun = false ) => { const root = path.dirname(outputPath); await mkdir(root, { recursive: true }); const format = formatOverride ?? project.config?.formats.project; - const output = project?.serialize('project', { format }); + + const output = + format === 'state' + ? 
project?.serialize('state', { format: 'json' }) + : project?.serialize('project', { format }); const maybeWriteFile = (filePath: string, output: string) => { if (!dryRun) { diff --git a/packages/cli/test/projects/deploy.test.ts b/packages/cli/test/projects/deploy.test.ts index 5809fe736..50bb040a8 100644 --- a/packages/cli/test/projects/deploy.test.ts +++ b/packages/cli/test/projects/deploy.test.ts @@ -1,10 +1,36 @@ +import { readFile, writeFile } from 'node:fs/promises'; import test from 'ava'; +import mock from 'mock-fs'; +import path from 'node:path'; import Project, { generateWorkflow } from '@openfn/project'; import { createMockLogger } from '@openfn/logger'; -import { reportDiff } from '../../src/projects/deploy'; +import createLightningServer, { + DEFAULT_PROJECT_ID, +} from '@openfn/lightning-mock'; + +import { + handler as deployHandler, + reportDiff, +} from '../../src/projects/deploy'; +import { myProject_yaml, myProject_v1 } from './fixtures'; +import { checkout } from '../../src/projects'; const logger = createMockLogger(undefined, { level: 'debug' }); +const port = 9876; +const ENDPOINT = `http://localhost:${port}`; + +let server: any; + +test.before(async () => { + server = await createLightningServer({ port }); +}); + +test.beforeEach(() => { + server.addProject(myProject_v1); + logger._reset(); +}); + // what will deploy tests look like? // deploy a project for the first time (this doesn't work though?) 
@@ -148,3 +174,47 @@ test('reportDiff: should report mix of added, changed, and removed workflows', ( t.truthy(logger._find('always', /workflows removed/i)); t.truthy(logger._find('always', /- c/i)); }); + +test.serial.only( + 'deploy a change to a project and write the yaml back', + async (t) => { + // Mock the filesystem + mock({ + '/ws/.projects/main@test.yaml': myProject_yaml, + '/ws/openfn.yaml': '', + }); + + // first checkout the project + await checkout( + { + project: 'main', + workspace: '/ws', + }, + logger + ); + + // Now change the expression + await writeFile('/ws/workflows/my-workflow/transform-data.js', 'log()'); + + // TODO by testing deploy like a closed box, and with the lightning mock, + // it's hard to see what is actually sent + // but it's my mock, it should be able to help with this! + await deployHandler( + { + endpoint: ENDPOINT, + apiKey: 'test-api-key', + workspace: '/ws', + force: true, // TODO hoping to remove this soon + } as any, + logger + ); + console.log(logger._history.map((l) => l.at(-1))); + + const expected = myProject_yaml.replace('fn()', 'log()'); + const projectYaml = await readFile('/ws/.projects/main@test.yaml', 'utf8'); + t.is(projectYaml, expected); + + const success = logger._find('success', /Updated project at/); + t.truthy(success); + } +); diff --git a/packages/cli/test/projects/fetch.test.ts b/packages/cli/test/projects/fetch.test.ts index ff6ef9922..301d656e7 100644 --- a/packages/cli/test/projects/fetch.test.ts +++ b/packages/cli/test/projects/fetch.test.ts @@ -480,7 +480,7 @@ test.serial( .replace('fn()', 'fn(x)') // arbitrary edit so that we can track the change .replace(' - a', ' - z'); // change the local history to be incompatible - // Make it look like we've checked out hte project + // Make it look like we've checked out the project mock({ '/ws/.projects': {}, '/ws/openfn.yaml': '',
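
A note on the `getSerializePath` change in `packages/cli/src/projects/util.ts`: with `project` now optional, the fallback path is built from `project?.qname`, so when no local project is resolved and no explicit output path is given, the file name appears to come out as `undefined`. The sketch below reproduces that logic in isolation so the behaviour is easy to check. It is a minimal stand-in rather than the CLI's own code: `ProjectLike` and `sketchSerializePath` are hypothetical names, and `path.posix.resolve` is assumed in place of the `resolvePath` helper, whose definition is not shown in this diff.

```typescript
import * as path from 'node:path';

// Hypothetical, pared-down shape of the fields the helper reads from a Project.
type ProjectLike = {
  qname: string;
  config: { dirs: { projects?: string } };
};

// Mirrors the patched logic: every argument is optional.
// Assumption: resolvePath behaves like path.resolve (POSIX flavour used here).
const sketchSerializePath = (
  project?: ProjectLike,
  workspacePath?: string,
  outputPath?: string
): string => {
  const outputRoot = path.posix.resolve(outputPath || workspacePath || '.');
  const projectsDir = project?.config.dirs.projects ?? '.projects';
  // An explicit output path always wins and is returned untouched.
  return outputPath ?? `${outputRoot}/${projectsDir}/${project?.qname}`;
};

const known: ProjectLike = { qname: 'main@test', config: { dirs: {} } };

// Resolved local project: file is named after its qname.
console.log(sketchSerializePath(known, '/ws')); // /ws/.projects/main@test

// No local project: the template literal stringifies undefined.
console.log(sketchSerializePath(undefined, '/ws')); // /ws/.projects/undefined

// Explicit output path: returned as given.
console.log(sketchSerializePath(undefined, '/ws', '/tmp/state.json')); // /tmp/state.json
```

If the `undefined` file name is not intended, one option might be for `fetchV1` to fall back to `options.project` when naming the file; the diff does not say which behaviour the maintainers want.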