Could "Any-Latin; Latin-ASCII" be added to replace_non_ascii() to address logographic/Cyrillic/Devanagari characters? #64

I see that `replace_non_ascii()` uses `stringi::stri_trans_general(x, "latin-ascii")`.

That transform only converts Latin-script characters, so text in other scripts (Japanese kana, Korean Hangul, Cyrillic, Devanagari) passes through unchanged:

```r
library(stringi)
x <- c("キャンパス", "재미", "wylądować", "Дорога", "heiß", "Raül", "brûlée", "भोजन")
Encoding(x) <- "UTF-8"
stri_trans_general(x, id = "Latin-ASCII")
```
```
[1] "キャンパス" "재미"       "wyladowac"  "Дорога"     "heiss"     
[6] "Raul"       "brulee"     "भोजन" 
```
The function could instead transliterate to Latin script first (`Any-Latin`) and then to ASCII (`Latin-ASCII`), which seems like a safer default:

```r
stri_trans_general(x, id = "Any-Latin; Latin-ASCII")
```
```
[1] "kyanpasu"  "jaemi"     "wyladowac" "Doroga"    "heiss"     "Raul"     
[7] "brulee"    "bhojana"
```
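In case it helps, here is a rough sketch of how the chained transform could be wrapped. `to_ascii_sketch()` and its `warn_leftover` argument are hypothetical names of my own, not anything in the package; the only real calls are `stringi::stri_trans_general()` and `stringi::stri_enc_isascii()`.

```r
library(stringi)

# Hypothetical helper, not part of the package: transliterate any script to
# Latin first, then fold the Latin characters down to ASCII.
to_ascii_sketch <- function(x, warn_leftover = TRUE) {
  out <- stri_trans_general(x, id = "Any-Latin; Latin-ASCII")

  # Flag elements that still contain non-ASCII bytes after both transforms
  # (NA inputs are ignored here).
  leftover <- !stri_enc_isascii(out) & !is.na(out)
  if (warn_leftover && any(leftover)) {
    warning(sum(leftover), " element(s) still contain non-ASCII characters")
  }

  out
}

to_ascii_sketch(c("Дорога", "heiß", "Raül"))
```

The warning is only there so that anything the transliterator cannot map gets noticed rather than silently passed through; whether the package would rather drop or replace such characters is a separate design choice.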

Just a thought -- love the package!


