-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Adding support for "role" attributes for the DocBook reader (third time's a charm) #11255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commits adds a getRoleAttr function to get the value from the role attribute of a DocBook element. This value is then added to the returned Pandoc object by using addPandocAttributes.
Elements parsed with withOptionalTitle do not automatically get the role attributes transfered to them. This covers this problem.
test/docbook-reader.native
Outdated
| , Div | ||
| ( "" | ||
| , [ "section" ] | ||
| , [ ( "role" , "sect1role" ) , ( "level" , "1" ) ] | ||
| ) | ||
| [ Header 1 ( "headers" , [] , [] ) [ Str "Headers" ] | ||
| , Div | ||
| ( "" | ||
| , [ "section" ] | ||
| , [ ( "role" , "sect2role" ) , ( "level" , "2" ) ] | ||
| ) | ||
| [ Header | ||
| 2 | ||
| ( "level-2-with-an-embedded-link" , [] , [] ) | ||
| [ Str "Level" | ||
| , Space | ||
| , Str "2" | ||
| , Space | ||
| , Str "with" | ||
| , Space | ||
| , Str "an" | ||
| , Space | ||
| , Link | ||
| ( "" , [] , [] ) | ||
| [ Str "embedded" , Space , Str "link" ] | ||
| ( "/url" , "" ) | ||
| ] | ||
| , Header | ||
| 3 | ||
| ( "level-3-with-emphasis" , [] , [] ) | ||
| [ Str "Level" | ||
| , Space | ||
| , Str "3" | ||
| , Space | ||
| , Str "with" | ||
| , Space | ||
| , Emph [ Str "emphasis" ] | ||
| ] | ||
| , Header | ||
| 4 | ||
| ( "level-4" , [] , [] ) | ||
| [ Str "Level" , Space , Str "4" ] | ||
| , Header | ||
| 5 | ||
| ( "level-5" , [] , [] ) | ||
| [ Str "Level" , Space , Str "5" ] | ||
| , Para [ Str "Hi." ] | ||
| ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I commented before, I think this kind of mixing of bare Header elements and section Divs is going to cause problems.
Using addPandocAttributes on a sequence of blocks starting with a header is going to apply them to all the blocks. So, you'll need to do it more subtly. You could divide up the sequence and just apply addPandocAttributes to the first element, for example.
Another thought: since role in docbook is basically like class in HTML, should we be putting the role attributes in the spot used for classes in pandoc attributes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think I finally understood what you meant back then. Indeed the resulting structure should be consistent on every level. I made the appropriate change to reflect that.
Regarding the role vs. class thing, I think you are right.
This comment in this issue related to the role attribute by @lifeunleaded indicated that role attributes can be used for many different things, but there is no source for that!
There would be legitimate reasons to consider role attributes as classes. For example, this is the choice made by the people from AsciiDoc:
The
roleattribute in AsciiDoc always get mapped to theclassattribute in the HTML output. In other words, role names are synonymous with HTML class names, thus allowing output elements to be identified and styled in CSS using class selectors (e.g.,sidebarblock.role1)
And in the DocBook Element Reference, we get:
roleProvides additional, user-specified classification for an element.
And in classification, there is class...
Personally, I have never used the role attribute directly, but I have always converted it to classes... I even have a Lua filter for this. (By the way, I think this filter highlights a bug in the Markdown writer that I haven't looked into yet...)
So, all in all, except if @lifeunleaded comes up with a good reason to avoid this decision, I say we do the same as the AsciiDoc folks? Would you have a nice approach to recommend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanntrividic A few points: I am not the one who will be approving the PR, I was merely pointing out that role in docbook differs subtly from class in e.g. HTML or Pandoc. I think your feature work is commendable and did not intend to devalue it with my comment. I'll elaborate at the end because it is very academic and I don't think it needs to block this useful feature. I can point to a few sources, but mainly to a few decades of working with applied DocBook, including in integration with Pandoc.
Elaboration (no need to read this unless you're very interested in DocBook as a standard):
While role is a common attribute in the sense that it occurs on all DocBook elements, customizers will find that it is not part of any of the “common attribute” patterns. It is parameterized differently because it is useful to be able to subclass role independently on different elements. TDG
A few select applied examples of differences:
-
DocBook tables exist in two variants: HTML and CALS. CALS tables can carry
rolebut notclassin DocBook. HTML tables can carryclassbut notrolein DocBook. -
Pandoc represents an unnumbered heading with a
classtokenunnumbered. In applied DocBook, distinctions in numbering are typically carried in thelabelattribute instead.
However, the DocBook reader does not take all variations in DocBook into account at present anyway, so I don't think this approach will make anything less useful, but quite a few things more useful, so I'm in favor of getting it in.
For those who really need to fine tune things and get other attributes than role in a class token in the Pandoc AST, they will typically have the possibility to preprocess the XML with XSLT to put things in the right place anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lifeunleaded In the end I wasn't sure if you were in favor of parsing role into pandoc class attributes. When you say "I'm in favor of getting [this approach] in", do you mean treating role as class or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your feature work is commendable and did not intend to devalue it with my comment.
Oh sorry if I made you believe you said something the wrong way, you did not devalue anything, you're all right, and thank you for your feedback actually!
From what I understand, @lifeunleaded you are saying that we should indeed process roles as classes. If it is really the case, then would you recommend a specific approach for this @jgm, given where we are now? I am not sure what would be the best path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lifeunleaded In the end I wasn't sure if you were in favor of parsing role into pandoc class attributes. When you say "I'm in favor of getting [this approach] in", do you mean treating role as class or something else?
@jgm In a nutshell: I am in favor of reading role attributes in all elements in the way the PR is currently formulated (putting them into attributes with key role). I prefer it over putting them all in class, to avoid clashing with any existing assumptions on the contents of class. Sorry for being unclear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh okay! So, personally I am fine with both, as I believe there are legitimate arguments for both solutions. @jgm, what say you?
Elements are now discriminated before applying addPandocAttributes: if the sect function is used, then only the first element gets addPandocAttributes applied. Other Blocks still get the same results. Tests were updated to reflect this.
This PR is the direct result of what has been discussed first on #10665, and then on #10932. This proposition is basically a cleaner version based on the previous contribution attempts.
More details on what this PR cover can be found in #10665 (comment). @jgm, for reminders, you had a comment there, which I addressed but that didn't get an answer.
Hopefully, this try is the right one! (And sorry again for the messy contribution.)