@U4RASD

Unit for Research In Arabic Social and Digital Spaces

Part of the Arab Center for Research and Policy Studies (ACRPS)

Popular repositories

  1. dalla-model-training Public

    Dalla training recipe using the Hugging Face SFT trainer

    Python 7

  2. r-bpe Public

    R-BPE: Improving BPE-Tokenizers with Token Reuse

    Python 4 2

  3. dalla-data-processing Public

    Data processing pipeline used for the DALLA models.

    C

  4. dalla-sentencepiece Public

    A tool to update an existing tokenizer model with new and common tokens from a newly trained tokenizer model. It works with any SentencePiece model.

    Python

  5. .github Public

  6. ar-ms-baseline Public

    Training code to establish a baseline for the NAKBA NLP 2026: Arabic Manuscript Understanding Shared Task

    Python

Repositories

Showing 6 of 6 repositories
  • dalla-data-processing Public

    Data processing pipeline used for the DALLA models.

    C 0 0 1 0 Updated Jan 18, 2026
  • ar-ms-baseline Public

    Training code to establish a baseline for the NAKBA NLP 2026: Arabic Manuscript Understanding Shared Task

    Python 0 0 0 0 Updated Jan 14, 2026
  • dalla-model-training Public

    Dalla training recipe using the Hugging Face SFT trainer

    Python 7 0 0 0 Updated Dec 16, 2025
  • dalla-sentencepiece Public

    A tool to update an existing tokenizer model with new and common tokens from a newly trained tokenizer model. It works with any SentencePiece model.

    Python 0 0 0 0 Updated Dec 15, 2025
  • r-bpe Public

    R-BPE: Improving BPE-Tokenizers with Token Reuse

    Python 4 2 0 1 Updated Nov 26, 2025
  • .github Public
    0 0 0 0 Updated Nov 26, 2025
