refactor: decompose lexer into functional submodules #264
Following the modularization of the parser and codegen, this PR breaks down the monolithic `lexer.rs` into specialized submodules within `front/lexer/src/lexer/`. This reorganization separates low-level source navigation from high-level token dispatch and literal parsing, making the lexer easier to maintain and extend.

Key Changes
1. Lexer Modularization
The lexer logic has been split into the following functional components:
- `core.rs`: Definitions for the `Lexer` and `Token` structures, serving as the foundational types.
- `cursor.rs`: Implements low-level source navigation methods such as `advance()`, `peek()`, `peek_next()`, and `match_next()`.
- `scan.rs`: The primary entry point for tokenization, containing the main `next_token()` dispatch logic and character-level matching.
- `ident.rs`: Logic for scanning identifiers and mapping them to language keywords.
- `literals.rs`: Specialized scanning for string and character literals, including escape sequence handling.
- `trivia.rs`: Logic for skipping non-token "trivia" such as whitespace and various comment styles.
- `common.rs`: Internal shared imports and utilities used across the lexer submodules.

2. Integration & API Cleanup
- Updated `front/lexer/src/lib.rs` and `mod.rs` to correctly export the new modular structure while maintaining a clean public API.
- Updated the `front/parser` crate to align with the new lexer paths, specifically ensuring `TokenType` and `Token` are correctly referenced.

3. Behavioral Consistency
- The `Lexer` public interface remains stable to prevent breaking changes in the compiler runner.

Impact
`cursor.rs`, while adding new keywords only involves `ident.rs`.
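
To illustrate the separation described in this PR, here is a minimal sketch of how cursor-style navigation and identifier/keyword mapping might be kept apart. It is not the code from this PR: the struct fields, the `TokenType` variants, the `identifier` and `keyword_or_identifier` helpers, and the `let`/`fn` keywords are all assumptions chosen for illustration. Only the method names `advance()`, `peek()`, `peek_next()`, and `match_next()` come from the description above, and their signatures here are guesses.

```rust
// Hypothetical sketch; names other than advance/peek/peek_next/match_next
// are assumptions and do not reflect the PR's actual code.
#[derive(Debug, PartialEq)]
enum TokenType {
    Let, // assumed keyword
    Fn,  // assumed keyword
    Identifier(String),
}

struct Lexer<'a> {
    source: &'a str,
    current: usize, // byte offset of the next unread character
}

impl<'a> Lexer<'a> {
    fn new(source: &'a str) -> Self {
        Lexer { source, current: 0 }
    }

    // --- what might live in cursor.rs ---

    /// Returns the next character without consuming it.
    fn peek(&self) -> Option<char> {
        self.source[self.current..].chars().next()
    }

    /// Returns the character after the next one without consuming anything.
    fn peek_next(&self) -> Option<char> {
        let mut chars = self.source[self.current..].chars();
        chars.next();
        chars.next()
    }

    /// Consumes and returns the next character.
    fn advance(&mut self) -> Option<char> {
        let c = self.peek()?;
        self.current += c.len_utf8();
        Some(c)
    }

    /// Consumes the next character only if it equals `expected`.
    fn match_next(&mut self, expected: char) -> bool {
        if self.peek() == Some(expected) {
            self.advance();
            true
        } else {
            false
        }
    }

    // --- what might live in ident.rs ---

    /// Scans an identifier starting at the cursor and classifies it.
    fn identifier(&mut self) -> TokenType {
        let start = self.current;
        while matches!(self.peek(), Some(c) if c.is_alphanumeric() || c == '_') {
            self.advance();
        }
        keyword_or_identifier(&self.source[start..self.current])
    }
}

/// Maps scanned text to a keyword token, falling back to an identifier.
/// Under this layout, adding a keyword would only mean adding a match arm here.
fn keyword_or_identifier(text: &str) -> TokenType {
    match text {
        "let" => TokenType::Let,
        "fn" => TokenType::Fn,
        _ => TokenType::Identifier(text.to_string()),
    }
}
```

In this sketch the identifier scanner only touches the source through the cursor helpers, so a change to how the source is traversed would stay inside the cursor methods.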