Commit 9de8e39

[Python] Fixes for string escapes, placeholders, replacements (sublimehq#4366)
* [Python] Scope placeholders in raw strings

  This commit includes `string-placeholders` context to raw strings as all of them can be used as format-strings in e.g. `R"Hello %s" % R'World'`.

* [Python] Reorder context includes

  This commit reorders include statements to reduce syntax cache size.

* [Python] Fix string replacements

  This commit...

  1. moves string-replacement includes before includes of normal escapes to make sure special `{{` and `\{{` escape patterns take precedence.
  2. removes string-replacement includes from b-strings as those don't support format methods such as `b"{0}".format(b"invalid")` and string format placeholders are not scoped as well.
  3. adds string-replacement includes to plain raw strings, as those can be used in format strings and string format placeholders have been added, before.

  Notes:

  - raw SQL strings already include them
  - raw RegExp strings need more work to prevent ambiguities with braced quantifiers such as `{1}`, hence don't include them at the moment.

* [Python] Drop unicode escapes from raw-strings

  This commit removes `escaped-unicode-chars` context includes from all raw string content contexts as each of them returns e.g. `\u2020` unchanged.

* [Python] Merge escaped string replacement braces

  As a result of reordering string-replacement contexts, it turns out f-strings and normal string-replacements sharing same brace escaping rules.

* [Python] Move and extend some f-string tests

  This commit...

  1. moves various tests to group f-string tests.
  2. adds tests for escape sequences and placeholders in all sorts of f-strings.

* [Python] Add tests to verify recent changes

  This commit adds tests for all sorts of (except f-) strings to verify escape sequences, placeholders and string replacements being applied as expected.

* [Python] Restrict known regexp escape sequences

  This commit overrides `known_char_escape` variable in python's regexp syntax to

  1. scope all `\c` sequences illegal, even when followed by numbers, as python's `re` and `regex` modules don't support it.
  2. remove explicit patterns `\\[tnrfae]` as those are handled by default escape pattern `\.`.
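The runtime behaviour that the string-related changes mirror can be checked in a Python 3 interpreter. The snippet below is a minimal sketch and not part of the commit; all variable names are illustrative.

```python
# Raw strings can act as printf-style format strings (example from the message).
greeting = R"Hello %s" % R'World'

# Raw strings also accept str.format replacement fields; backslashes stay literal.
path = r"C:\{0}\{1}".format("Users", "demo")

# Bytes support %-formatting but have no .format method at all.
payload = b"id=%d" % 7
bytes_have_format = hasattr(b"", "format")

# A raw string keeps \u2020 as six literal characters,
# while a normal string decodes it to a single code point.
raw_text = r"\u2020"
cooked_text = "\u2020"
```

This is why placeholders and replacement fields are now scoped in plain raw strings, replacement fields are no longer scoped in b-strings, and `\u` escapes are no longer scoped in raw strings.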
1 parent 91ad808 commit 9de8e39

3 files changed: +667 -99 lines changed

Python/Embeddings/RegExp (for Python).sublime-syntax

Lines changed: 3 additions & 0 deletions
@@ -8,6 +8,9 @@ hidden: true
 extends: Packages/Regular Expressions/RegExp (Basic).sublime-syntax

 variables:
+  # escapes
+  known_char_escape: \\(?:[0-7]{3}|x\{\h{1,7}\}|x\h\h)
+
   # modifiers
   activate_x_mode: (?:\?[imsLua]*x[ixmsLua]*(?:-[imsLua]+)?)
   deactivate_x_mode: (?:\?[imsLua]*-[imsLua]*x[imxsLua]*)
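To illustrate what the new `known_char_escape` variable accepts, here is a rough Python sketch. It is an approximate translation, not the syntax engine's behaviour: the `\h` hex-digit class is spelled out as `[0-9a-fA-F]` because Python's `re` has no such class, and the helper names `is_known_char_escape` and `compiles` are hypothetical. The second helper shows that Python's `re` module rejects `\c` escapes, which is the reason the commit message gives for scoping them as illegal.

```python
import re

# Approximate translation of the `known_char_escape` variable (assumption:
# \h is written out as an explicit hex-digit class).
KNOWN_CHAR_ESCAPE = re.compile(
    r"\\(?:[0-7]{3}|x\{[0-9a-fA-F]{1,7}\}|x[0-9a-fA-F]{2})"
)


def is_known_char_escape(text):
    """Return True if `text` is exactly one escape covered by the variable."""
    return KNOWN_CHAR_ESCAPE.fullmatch(text) is not None


def compiles(pattern):
    """Return True if Python's `re` module accepts `pattern`."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True
```

With these helpers, `is_known_char_escape(r"\x41")` holds while `is_known_char_escape(r"\cA")` does not, and `compiles(r"\cA")` is false because `re` raises a "bad escape" error for it.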

Python/Python.sublime-syntax

Lines changed: 33 additions & 34 deletions
@@ -2806,8 +2806,9 @@ contexts:

   triple-double-quoted-plain-raw-u-string-content:
     - include: string-prototype
+    - include: string-placeholders
+    - include: triple-double-quoted-string-replacements
     - include: escaped-raw-quotes
-    - include: escaped-unicode-chars

   triple-double-quoted-raw-u-string:
     # Triple-quoted raw string, unicode or not, will detect SQL, otherwise regex
@@ -2841,7 +2842,7 @@ contexts:

   triple-double-quoted-regexp-raw-u-string-content:
     - include: string-prototype
-    - include: escaped-unicode-chars
+    - include: string-placeholders

   triple-double-quoted-sql-raw-u-string-body:
     - meta_include_prototype: false
@@ -2855,10 +2856,9 @@ contexts:

   triple-double-quoted-sql-raw-u-string-content:
     - include: string-prototype
-    - include: escaped-raw-quotes
-    - include: escaped-unicode-chars
     - include: string-placeholders
     - include: triple-double-quoted-string-replacements
+    - include: escaped-raw-quotes

   triple-double-quoted-b-string:
     # Triple-quoted string, bytes, no syntax embedding
@@ -2877,9 +2877,8 @@ contexts:
   triple-double-quoted-b-string-content:
     - include: string-prototype
     - include: string-continuations
-    - include: escaped-chars
     - include: string-placeholders
-    - include: triple-double-quoted-string-replacements
+    - include: escaped-chars

   triple-double-quoted-f-string:
     # Triple-quoted f-string or t-string
@@ -2942,10 +2941,10 @@ contexts:
   triple-double-quoted-u-string-content:
     - include: string-prototype
     - include: string-continuations
-    - include: escaped-unicode-chars
-    - include: escaped-chars
     - include: string-placeholders
     - include: triple-double-quoted-string-replacements
+    - include: escaped-unicode-chars
+    - include: escaped-chars

   triple-double-quoted-string-replacements:
     - include: escaped-string-braces
@@ -2989,7 +2988,7 @@ contexts:
       fail: triple-double-quoted-string-replacement

   triple-double-quoted-f-string-replacements:
-    - include: escaped-f-string-braces
+    - include: escaped-string-braces
     - include: invalid-f-string-replacements
     - match: \{
       scope: punctuation.section.interpolation.begin.python
@@ -3158,6 +3157,8 @@ contexts:

   double-quoted-plain-raw-u-string-content:
     - include: string-prototype
+    - include: string-placeholders
+    - include: double-quoted-string-replacements
     - include: escaped-raw-quotes

   double-quoted-raw-u-string:
@@ -3193,6 +3194,7 @@ contexts:

   double-quoted-regexp-raw-u-string-content:
     - include: string-prototype
+    - include: string-placeholders

   double-quoted-sql-raw-u-string-body:
     - meta_include_prototype: false
@@ -3206,9 +3208,9 @@ contexts:

   double-quoted-sql-raw-u-string-content:
     - include: string-prototype
-    - include: escaped-raw-quotes
     - include: string-placeholders
     - include: double-quoted-string-replacements
+    - include: escaped-raw-quotes

   double-quoted-b-string:
     # Single-line string, bytes
@@ -3226,9 +3228,8 @@ contexts:

   double-quoted-b-string-content:
     - include: string-prototype
-    - include: escaped-chars
     - include: string-placeholders
-    - include: double-quoted-string-replacements
+    - include: escaped-chars

   double-quoted-f-string:
     # Single-line f-string or t-string
@@ -3289,10 +3290,10 @@ contexts:

   double-quoted-u-string-content:
     - include: string-prototype
-    - include: escaped-unicode-chars
-    - include: escaped-chars
     - include: string-placeholders
     - include: double-quoted-string-replacements
+    - include: escaped-unicode-chars
+    - include: escaped-chars

   double-quoted-string-replacements:
     - include: escaped-string-braces
@@ -3336,7 +3337,7 @@ contexts:
       fail: double-quoted-string-replacement

   double-quoted-f-string-replacements:
-    - include: escaped-f-string-braces
+    - include: escaped-string-braces
     - include: invalid-f-string-replacements
     - match: \{
       scope: punctuation.section.interpolation.begin.python
@@ -3496,6 +3497,8 @@ contexts:

   triple-single-quoted-plain-raw-u-string-content:
     - include: string-prototype
+    - include: string-placeholders
+    - include: triple-single-quoted-string-replacements
     - include: escaped-raw-quotes

   triple-single-quoted-raw-u-string:
@@ -3530,7 +3533,7 @@ contexts:

   triple-single-quoted-regexp-raw-u-string-content:
     - include: string-prototype
-    - include: escaped-unicode-chars
+    - include: string-placeholders

   triple-single-quoted-sql-raw-u-string-body:
     - meta_include_prototype: false
@@ -3544,10 +3547,9 @@ contexts:

   triple-single-quoted-sql-raw-u-string-content:
     - include: string-prototype
-    - include: escaped-raw-quotes
-    - include: escaped-unicode-chars
     - include: string-placeholders
     - include: triple-single-quoted-string-replacements
+    - include: escaped-raw-quotes

   triple-single-quoted-b-string:
     # Triple-quoted string, bytes, no syntax embedding
@@ -3566,9 +3568,8 @@ contexts:
   triple-single-quoted-b-string-content:
     - include: string-prototype
     - include: string-continuations
-    - include: escaped-chars
     - include: string-placeholders
-    - include: triple-single-quoted-string-replacements
+    - include: escaped-chars

   triple-single-quoted-f-string:
     # Triple-quoted f-string or t-string
@@ -3631,10 +3632,10 @@ contexts:
   triple-single-quoted-u-string-content:
     - include: string-prototype
     - include: string-continuations
-    - include: escaped-unicode-chars
-    - include: escaped-chars
     - include: string-placeholders
     - include: triple-single-quoted-string-replacements
+    - include: escaped-unicode-chars
+    - include: escaped-chars

   triple-single-quoted-string-replacements:
     - include: escaped-string-braces
@@ -3678,7 +3679,7 @@ contexts:
       fail: triple-single-quoted-string-replacement

   triple-single-quoted-f-string-replacements:
-    - include: escaped-f-string-braces
+    - include: escaped-string-braces
     - include: invalid-f-string-replacements
     - match: \{
       scope: punctuation.section.interpolation.begin.python
@@ -3806,6 +3807,8 @@ contexts:

   single-quoted-plain-raw-u-string-content:
     - include: string-prototype
+    - include: string-placeholders
+    - include: single-quoted-string-replacements
     - include: escaped-raw-quotes

   single-quoted-raw-u-string:
@@ -3841,6 +3844,7 @@ contexts:

   single-quoted-regexp-raw-u-string-content:
     - include: string-prototype
+    - include: string-placeholders

   single-quoted-sql-raw-u-string-body:
     - meta_include_prototype: false
@@ -3854,9 +3858,9 @@ contexts:

   single-quoted-sql-raw-u-string-content:
     - include: string-prototype
-    - include: escaped-raw-quotes
     - include: string-placeholders
     - include: single-quoted-string-replacements
+    - include: escaped-raw-quotes

   single-quoted-plain-raw-f-string:
     # Single-line raw f-string
@@ -3915,9 +3919,8 @@ contexts:

   single-quoted-b-string-content:
     - include: string-prototype
-    - include: escaped-chars
     - include: string-placeholders
-    - include: single-quoted-string-replacements
+    - include: escaped-chars

   single-quoted-f-string:
     # Single-line f-string or t-string
@@ -3978,10 +3981,10 @@ contexts:

   single-quoted-u-string-content:
     - include: string-prototype
-    - include: escaped-unicode-chars
-    - include: escaped-chars
     - include: string-placeholders
     - include: single-quoted-string-replacements
+    - include: escaped-unicode-chars
+    - include: escaped-chars

   single-quoted-string-replacements:
     - include: escaped-string-braces
@@ -4025,7 +4028,7 @@ contexts:
       fail: single-quoted-string-replacement

   single-quoted-f-string-replacements:
-    - include: escaped-f-string-braces
+    - include: escaped-string-braces
     - include: invalid-f-string-replacements
     - match: \{
       scope: punctuation.section.interpolation.begin.python
@@ -4099,7 +4102,7 @@ contexts:
     # but don't scope them as escapes as they are treated literal
     - match: \\[\\'"]

-  escaped-f-string-braces:
+  escaped-string-braces:
     # https://peps.python.org/pep-0498
     # https://peps.python.org/pep-0701
     # https://docs.python.org/3.6/reference/lexical_analysis.html#f-strings
@@ -4108,10 +4111,6 @@ contexts:
       captures:
         1: invalid.deprecated.character.escape.python

-  escaped-string-braces:
-    - match: \{\{|\}\}
-      scope: constant.character.escape.python
-
   string-placeholders:
     - match: |- # printf style
       (?x)
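
The merge of the two brace-escape contexts into `escaped-string-braces` reflects that doubled braces are escapes in both `str.format` templates and f-strings. A minimal Python sketch of that shared rule follows; the variable names are illustrative and not taken from the commit.

```python
name = "World"

# In a str.format template, {{ and }} produce literal braces.
formatted = "{{{0}}}".format(name)

# The same doubling rule applies inside f-strings.
interpolated = f"{{{name}}}"

# Doubled braces alone never start a replacement field.
literal_template = "{{}}".format()
literal_fstring = f"{{name}}"
```

Both `formatted` and `interpolated` evaluate to `{World}`, which is why a single context can cover the escape in either string kind.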
