Commit 82c6238

add more language support
1 parent 3d58857 commit 82c6238

5 files changed

+104
-34
lines changed

CHANGELOG.md

Lines changed: 18 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,24 @@
1+
# 1.3.0 (2025-03-26)
2+
Support various custom numbering styles in a clearer and more flexible way:
3+
- support a variety of auto-extending numbering symbols: Arabic numbers, Roman numbers, Latin/Greek/Cyrillic letters (both upper and lower case), Chinese numbers, etc.
4+
- support custom numbering symbols for anything: any level of section, figure, table, equation, theorem, etc.
5+
6+
Support appendix numbering.
7+
8+
Metadata `{item_type}-symbols` and fields `{item_type}_sym` are no longer supported.
9+
10+
## Migration Guide
11+
For users of the 1.2.x versions, there are some **removals**:
12+
13+
The following metadata keys were already marked as deprecated in the previous version, and they are now **removed**:
14+
- metadata keys `section-format-source-i` and `section-format-ref-i` for formatting the i-th level section numbering are now removed. Use the new `section-src-format-i` and `section-cref-format-i` keys instead.
15+
16+
The following features are now **removed**, and you should use the new API instead:
17+
- metadata keys `{item_type}-symbols` and formatting fields `{item_type}_sym` are now removed. Now you can simply use `{item_type}-numstyle` to specify the numbering style directly.
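
The renames above are mechanical, so a small script can apply them to an existing metadata mapping. The following is only a sketch: `migrate_metadata` is a hypothetical helper, not part of the filter, and it assumes the metadata has already been loaded into a plain `dict`. Keys ending in `-symbols` have no one-to-one replacement, so they are reported for manual conversion to `{item_type}-numstyle`.

```python
import re

# Old key pattern -> new key template, taken from the migration notes above.
RENAMES = [
    (re.compile(r"^section-format-source-(\d+)$"), r"section-src-format-\1"),
    (re.compile(r"^section-format-ref-(\d+)$"), r"section-cref-format-\1"),
]


def migrate_metadata(meta: dict) -> tuple[dict, list[str]]:
    """Return (migrated copy, removed keys that need manual attention)."""
    migrated, manual = {}, []
    for key, value in meta.items():
        if key.endswith("-symbols"):
            # Removed in 1.3.0; pick a `{item_type}-numstyle` value by hand.
            manual.append(key)
            continue
        for pattern, template in RENAMES:
            if pattern.match(key):
                key = pattern.sub(template, key)
                break
        migrated[key] = value
    return migrated, manual


old = {
    "section-format-source-1": "Chapter {num}",
    "subfigure-symbols": "αβγδ",
    "figure-prefix": "Fig",
}
new, needs_review = migrate_metadata(old)
```

Here `new` keeps `figure-prefix` untouched and carries the section format under `section-src-format-1`, while `needs_review` lists `subfigure-symbols`.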
18+
119
# 1.2.5 (2025-03-13)
220
Support theorem numbering. Also refer to a [StackExchange question](https://tex.stackexchange.com/questions/738132/simultaneously-cross-referencing-numbered-amsthm-theorems-and-numbered-equations).
321

4-
522
# 1.2.4 (2025-03-07)
623
Support customizable spacing command in the `equation-src-format` field (default now is `"\\quad({num})"`). Also refer to issue #11.
724

README.md

Lines changed: 51 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# pandoc-tex-numbering
2-
This is an all-in-one pandoc filter for converting your LaTeX files to any format while keeping **numbering, hyperlinks, caption formats and (clever) cross references in (maybe multi-line) equations, sections, figures, tables and theorems**. The formating is highly customizable, easy-to-use, and even more flexible than the LaTeX default.
2+
This is an all-in-one pandoc filter for converting your LaTeX files to any format while keeping **numbering, hyperlinks, caption formats and (clever) cross references in (maybe multi-line) equations, sections, figures, tables, theorems and appendices**. The formatting is highly customizable, easy-to-use, and even more flexible than the LaTeX default.
33

44
# Contents
55
- [pandoc-tex-numbering](#pandoc-tex-numbering)
@@ -11,6 +11,7 @@ This is an all-in-one pandoc filter for converting your LaTeX files to any forma
1111
- [Quick Start](#quick-start)
1212
- [Customization](#customization)
1313
- [General](#general)
14+
- [Numbering System](#numbering-system)
1415
- [Formatting System](#formatting-system)
1516
- [Prefix-based System](#prefix-based-system)
1617
- [Custom Formatting System (f-string formatting)](#custom-formatting-system-f-string-formatting)
@@ -20,8 +21,7 @@ This is an all-in-one pandoc filter for converting your LaTeX files to any forma
2021
- [Theorems](#theorems)
2122
- [List of Figures and Tables](#list-of-figures-and-tables)
2223
- [Multiple References](#multiple-references)
23-
- [Details](#details)
24-
- [Equations](#equations-1)
24+
- [Appendix](#appendix)
2525
- [List of Figures and Tables](#list-of-figures-and-tables-1)
2626
- [Data Export](#data-export)
2727
- [Log](#log)
@@ -41,7 +41,8 @@ This is an all-in-one pandoc filter for converting your LaTeX files to any forma
4141
- **`cleveref` Package**: `cref` and `Cref` commands are supported. You can customize the prefix of the references.
4242
- **Subfigures**: `subcaption` package is supported. Subfigures can be numbered with customized symbols and formats.
4343
- **Theorems**: Theorems are supported with customized formats.
44-
- **Non-Arabic Numbers**: Chinese numbers "第一章", "第二节" etc. are supported. You can customize the numbering format.
44+
- **Appendices**: Appendices are supported with customized formats.
45+
- **Non-Arabic Numbers**: Various non-Arabic numbering symbols are supported, such as Latin letters, Chinese numbers, Roman numbers, Greek letters, Cyrillic letters, etc.
4546
- **Custom List of Figures and Tables**: **Short captions** as well as custom lof/lot titles are supported for figures and tables.
4647
- **Custom Formatting of Everything**: You can customize the format of the numbering and references with python f-string format based on various fields we provide.
4748

@@ -84,6 +85,26 @@ You can set the following variables in the metadata of your LaTeX file to custom
8485
- `data-export-path`: Where to export the filter data. Default is `None`, which means no data will be exported. If set, the data will be exported to the specified path in the JSON format. This is useful for further usage of the filter data in other scripts or filter-debugging.
8586
- `auto-labelling`: Whether to automatically add identifiers (labels) to figures and tables without labels. Default is `true`. This has no effect on the output appearance but can be useful for cross-referencing in the future (for example, in the `.docx` output this will ensure that all your figures and tables have a unique auto-generated bookmark).
8687

88+
## Numbering System
89+
- `{item_type}-numstyle`: The style of the numbering of figures, tables, equations, sections, theorems, subfigures. For example `figure-numstyle` represents the style of the numbering of figures.
90+
- `{item_type}-numstyle-{i}`: The style of the i-th level of the numbering of sections or appendices. For example, `section-numstyle-1` represents the style of the first level of the numbering of sections.
91+
92+
Possible values are:
93+
- `arabic`: Arabic numbers (1, 2, 3, ...)
94+
- `roman`: Lowercase Roman numbers (i, ii, iii, ...)
95+
- `Roman`: Uppercase Roman numbers (I, II, III, ...)
96+
- `latin`: Lowercase Latin letters (a, b, c, ...)
97+
- `Latin`: Uppercase Latin letters (A, B, C, ...)
98+
- `greek`: Lowercase Greek letters (α, β, γ, ...)
99+
- `Greek`: Uppercase Greek letters (Α, Β, Γ, ...)
100+
- `cyrillic`: Lowercase Cyrillic letters (а, б, в, ...)
101+
- `Cyrillic`: Uppercase Cyrillic letters (А, Б, В, ...)
102+
- `zh`: Chinese numbers (一, 二, 三, ...)
103+
104+
Most items default to `arabic`. The exceptions are:
105+
- Default value of `subfigure-numstyle` is `latin`.
106+
- Default value of `appendix-numstyle-1` is `Latin`.
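
To make the style names concrete, here is a minimal, self-contained sketch of how an integer could be rendered under a few of them. It is not the filter's own code: `render`, `to_roman` and `from_alphabet` are hypothetical names, only five styles are covered, and because the filter's handling of numbers past the end of an alphabet is not shown here, this sketch simply raises an error in that case.

```python
import string


def to_roman(num: int) -> str:
    """Greedy conversion of a positive integer to uppercase Roman numerals."""
    breaks = [1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1]
    numerals = ["M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"]
    result = ""
    for value, numeral in zip(breaks, numerals):
        count, num = divmod(num, value)
        result += numeral * count
    return result


def from_alphabet(alphabet: str, num: int) -> str:
    """Pick the num-th symbol (1-based); out-of-range is an error in this sketch."""
    if not 1 <= num <= len(alphabet):
        raise ValueError(f"{num} is outside 1..{len(alphabet)}")
    return alphabet[num - 1]


# Style name -> converter, mirroring a subset of the values listed above.
STYLES = {
    "arabic": str,
    "Roman": to_roman,
    "roman": lambda n: to_roman(n).lower(),
    "latin": lambda n: from_alphabet(string.ascii_lowercase, n),
    "Latin": lambda n: from_alphabet(string.ascii_uppercase, n),
}


def render(num: int, style: str = "arabic") -> str:
    if style not in STYLES:
        raise ValueError(f"Unknown style {style!r}, expected one of {list(STYLES)}")
    return STYLES[style](num)
```

With these definitions, `render(2, "Roman")` gives `"II"` and `render(5, "latin")` gives `"e"`, which are the two styles combined in the `II.3.e` example in the Formatting System section.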
107+
87108
## Formatting System
88109

89110
We support a very flexible formatting system for the numbering and references. There are two different formatting systems for the numbering and references. You can use them together. The two systems are:
@@ -119,18 +140,28 @@ For sections, every level has its own formatting. You can set the metadata, for
119140
For equations, the default `src` format (i.e. `equation-src-format`) is `"\\qquad({num})"`. `\qquad` is used to offer a little space between the equation and the number. You can customize it as you like.
120141

121142
#### Metadata Values
122-
The metadata values are python f-string format strings. Various fields are provided for you to customize the format. For example, if you set the `number-reset-level` to 2, `figure-prefix` to `figure` and `prefix-space` to `True`. Then, the fifth figure under subsection 2.3 will have the following fields:
123-
- `num`: `2.3.5`
124-
- `parent_num`: `2.3`
143+
The metadata values are python f-string format strings. Various fields are provided for you to customize the format. For example, if you have the following settings:
144+
- `number-reset-level`: `2`
145+
- `figure-prefix`: `"figure"`
146+
- `prefix-space`: `True`
147+
- `section-numstyle-1`: `"Roman"`
148+
- `figure-numstyle`: `"latin"`
149+
150+
Then, the fifth figure under subsection 2.3 will have the following fields:
151+
- `num`: `II.3.e`
152+
- `parent_num`: `II.3`
153+
- `this_num`: `e` (note that fields ending in `_num` keep the numbering style settings)
125154
- `fig_id`: `5`
126155
- `prefix`: `figure ` (note the space at the end)
127156
- `Prefix`: `Figure `
128157
- `h1`: `2`
129158
- `h2`: `3` (note that `h2` is accessible only when `number-reset-level` >= 2, and likewise for deeper levels)
130-
- `h1_zh`: `` (Chinese number support)
131-
- `h2_zh`: ``
132-
133-
For the subfigures, a special field `subfig_sym` is provided to represent the symbol of the subfigure. For example, if you set the `subfigure-symbols` metadata to `"αβγδ"`, the second subfigure will have the `subfig_sym` field as `"β"` while the `subfig_id` field as `2`.
159+
- `h1_zh`: `二`
160+
- `h1_roman`: `ii`
161+
- `h1_Roman`: `II`
162+
- `h1_latin`: `b`
163+
- `h1_Latin`: `B`
164+
- ... (any supported languages or symbols, see the [Numbering System](#numbering-system) section)
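
As a rough illustration of how such fields combine with a format string, the snippet below fills two templates with the values from the example above. It assumes the substitution behaves like Python's `str.format`; the `fields` dict is written out by hand here and is not produced by the filter.

```python
# Field values copied from the example above (fifth figure under subsection 2.3).
fields = {
    "num": "II.3.e",
    "parent_num": "II.3",
    "this_num": "e",
    "fig_id": 5,
    "prefix": "figure ",
    "Prefix": "Figure ",
    "h1": 2,
    "h2": 3,
    "h1_Roman": "II",
}

# A format string built from the styled number and the capitalized prefix.
label = "{Prefix}{num}".format(**fields)

# A format string that mixes a language-specific field with the raw figure id.
custom = "Fig. {h1_Roman}-{fig_id}".format(**fields)

print(label)   # Figure II.3.e
print(custom)  # Fig. II-5
```

Fields ending in `_num` already carry the configured numbering style, while `h1`, `h2` and `fig_id` stay plain integers, so a custom format can choose either representation.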
134165

135166
Here are some examples of the metadata values:
136167
- set the `fig-src-format` metadata to `"{prefix}{num}"`, the numbering before its caption will be shown as "Figure 2.3.5"
@@ -168,6 +199,15 @@ For more details, see the [List of Figures and Tables](#list-of-figures-and-tabl
168199

169200
NOTE: when setting metadata in a yaml file, leading and trailing spaces in the values are stripped by default. Therefore, if you want to keep the spaces in the yaml metadata file, **you should manually escape those spaces via double backslashes.** For example, if you want to set `multiple-ref-last-separator` to `" and "` (spaces appear at the beginning and the end), you should set it as `"\\ and\\ "` in the yaml file. See pandoc's [issue #10539](https://github.com/jgm/pandoc/issues/10539) for further discussion.
170201

202+
## Appendix
203+
- `appendix-names`: The names of the appendices separated by "/,". If you have this in your tex file:
204+
```latex
205+
\appendix
206+
\chapter{First Appendix}
207+
\chapter{Second Appendix}
208+
```
209+
You should set the metadata `appendix-names` to `"First Appendix/,Second Appendix"`. Note that the names should be separated by `"/,"`, not by `","` (so as to avoid conflicts with the commas in the names).
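
The reason for the two-character separator is easiest to see with a name that itself contains a comma. The sketch below uses a hypothetical helper, `split_appendix_names`, that splits on `"/,"` in the same way the `prepare` function in this commit does:

```python
def split_appendix_names(raw: str) -> list[str]:
    """Split the `appendix-names` metadata value on the "/," separator."""
    return raw.split("/,")


# The example from the text above.
names = split_appendix_names("First Appendix/,Second Appendix")
print(names)  # ['First Appendix', 'Second Appendix']

# A plain comma inside a name is left alone.
with_comma = split_appendix_names("Proofs, Part I/,Data Tables")
print(with_comma)  # ['Proofs, Part I', 'Data Tables']
```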
210+
171211
# Details
172212
173213
## Equations

src/pandoc_tex_numbering/lang_num.py

Lines changed: 26 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@ def arabic2chinese(num):
3333
result = result[1:]
3434
return result
3535

36-
def arabic2roman(num):
36+
def arabic2upper_roman(num):
3737
if num == 0: return "0"
3838
breaks = [1000,900,500,400,100,90,50,40,10,9,5,4,1]
3939
numerals = ["M","CM","D","CD","C","XC","L","XL","X","IX","V","IV","I"]
@@ -46,13 +46,16 @@ def arabic2roman(num):
4646
continue
4747
return result
4848

49-
def arabic2upper_latina(num):
50-
upper_latina_numerals = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
51-
return _from_seq(upper_latina_numerals,num)
49+
def arabic2lower_roman(num):
50+
return arabic2upper_roman(num).lower()
5251

53-
def arabic2lower_latina(num):
54-
lower_latina_numerals = "abcdefghijklmnopqrstuvwxyz"
55-
return _from_seq(lower_latina_numerals,num)
52+
def arabic2upper_latin(num):
53+
upper_latin_numerals = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
54+
return _from_seq(upper_latin_numerals,num)
55+
56+
def arabic2lower_latin(num):
57+
lower_latin_numerals = "abcdefghijklmnopqrstuvwxyz"
58+
return _from_seq(lower_latin_numerals,num)
5659

5760
def arabic2upper_greek(num):
5861
upper_greek_numerals = "ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ"
@@ -62,11 +65,22 @@ def arabic2lower_greek(num):
6265
lower_greek_numerals = "αβγδεζηθικλμνξοπρστυφχψω"
6366
return _from_seq(lower_greek_numerals,num)
6467

68+
def arabic2lower_cyrillic(num):
69+
lower_cyrillic_numerals = "абвгдежзийклмнопрстуфхцчшщъыьэюя"
70+
return _from_seq(lower_cyrillic_numerals,num)
71+
72+
def arabic2upper_cyrillic(num):
73+
upper_cyrillic_numerals = "АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"
74+
return _from_seq(upper_cyrillic_numerals,num)
75+
6576
language_functions = {
6677
"zh": arabic2chinese,
67-
"roman": arabic2roman,
68-
"letter": arabic2lower_latina,
69-
"Letter": arabic2upper_latina,
70-
"gletter": arabic2lower_greek,
71-
"Gletter": arabic2upper_greek,
78+
"Roman": arabic2upper_roman,
79+
"roman": arabic2lower_roman,
80+
"latin": arabic2lower_latin,
81+
"Latin": arabic2upper_latin,
82+
"greek": arabic2lower_greek,
83+
"Greek": arabic2upper_greek,
84+
"cyrillic": arabic2lower_cyrillic,
85+
"Cyrillic": arabic2upper_cyrillic
7286
}

src/pandoc_tex_numbering/numbering.py

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,9 @@ def header_fields(header_nums):
1212
})
1313
return fields
1414

15-
def nums2fields(nums,item_type,num_style="plain",prefix=None,pref_space=True,parent=None):
15+
def nums2fields(nums,item_type,num_style="arabic",prefix=None,pref_space=True,parent=None):
1616
parent_num = parent.ref if not parent is None else ""
17-
if num_style == "plain":
17+
if num_style == "arabic":
1818
this_num = str(nums[-1])
1919
else:
2020
assert num_style in language_functions, f"Invalid num_style: {num_style}, must be one of {list(language_functions.keys())}"
@@ -48,7 +48,7 @@ def nums2fields(nums,item_type,num_style="plain",prefix=None,pref_space=True,par
4848
return {**common_fields,**add_fields}
4949

5050
class Formater:
51-
def __init__(self,fmt_presets,item_type,num_style="plain",prefix=None,pref_space=True):
51+
def __init__(self,fmt_presets,item_type,num_style="arabic",prefix=None,pref_space=True):
5252
self.fmt_presets = fmt_presets
5353
self.item_type = item_type
5454
self.num_style = num_style

src/pandoc_tex_numbering/pandoc_tex_numbering.py

Lines changed: 6 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -71,7 +71,7 @@ def prepare(doc):
7171
"lot_title": doc.get_metadata("lot-title", "List of Tables"),
7272

7373
# Appendix Settings
74-
"apx_names": doc.get_metadata("appendix-names", "Appendix").split("\,"),
74+
"apx_names": doc.get_metadata("appendix-names", "Appendix").split("/,"),
7575

7676
# Miscellaneous
7777
"data_export_path": doc.get_metadata("data-export-path", None),
@@ -135,7 +135,7 @@ def prepare(doc):
135135
item_type=item,
136136
prefix=doc.get_metadata(f"{aka[item]}-prefix", aka[item].capitalize()),
137137
pref_space=pref_space,
138-
num_style=doc.get_metadata(f"{aka[item]}-numstyle", "plain")
138+
num_style=doc.get_metadata(f"{aka[item]}-numstyle", "arabic")
139139
)
140140

141141
for thm_type in doc.settings["theorem_names"]:
@@ -154,7 +154,7 @@ def prepare(doc):
154154
item_type=item_type,
155155
prefix=doc.get_metadata(f"theorem-{thm_type}-prefix", thm_type.capitalize()),
156156
pref_space=pref_space,
157-
num_style=doc.get_metadata(f"theorem-{thm_type}-numstyle", "plain")
157+
num_style=doc.get_metadata(f"theorem-{thm_type}-numstyle", "arabic")
158158
)
159159

160160

@@ -168,7 +168,7 @@ def prepare(doc):
168168
item_type="subfig",
169169
prefix=doc.get_metadata("subfigure-prefix", "Figure"),
170170
pref_space=pref_space,
171-
num_style=doc.get_metadata("subfigure-numstyle", "letter")
171+
num_style=doc.get_metadata("subfigure-numstyle", "latin")
172172
)
173173

174174
formaters["sec"] = []
@@ -186,9 +186,9 @@ def prepare(doc):
186186
fmt = doc.get_metadata(f"{aka[item]}-{preset}-format-{i}", default)
187187
fmt_presets[preset] = fmt
188188
if item == "apx" and i == 1:
189-
default_numstyle = "Letter"
189+
default_numstyle = "Latin"
190190
else:
191-
default_numstyle = "plain"
191+
default_numstyle = "arabic"
192192
i_th_formater = Formater(
193193
fmt_presets=fmt_presets,
194194
item_type=item,
@@ -361,7 +361,6 @@ def add_label_to_caption(num_obj,label:str,elem:Union[Figure,Table]):
361361

362362

363363
def find_labels_header(elem,doc):
364-
logger.info(f"Finding labels in header: {elem} with level {elem.level}")
365364
this_level = elem.level
366365
if this_level == 1:
367366
header_txt = to_string(elem)
