Another type of failure I see looks like this: 10.1086/591526+10.1088/0004-637X/706/1/L203
I'm not sure how we'd be able to tell that a "+" is not part of the DOI.
When I searched for this exact string, I found this listing: http://arxiv.org/abs/0805.4758 It seems that both DOIs are associated with the same paper: one is for the paper itself and the other is for an erratum to it!
I'm thinking that we might get high fitness by adding a special rule to the parser for splitting characters like "+", "&", and "?": if we see one of them right before some whitespace or a new DOI_START, then stop reading the DOI.
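
To make the proposed rule concrete, here is a rough sketch in Python. It is not the project's actual parser: the `DOI_START` pattern, the `SPLIT_CHARS` set, and the `read_doi` and `extract_dois` helpers are all assumptions made up for illustration. It also treats the end of the input the same as whitespace, which goes slightly beyond the rule as stated.

```python
import re

# Assumption: a DOI begins with "10.", a 4-9 digit registrant code, then "/".
DOI_START = re.compile(r"10\.\d{4,9}/")
# Hypothetical set of characters that may separate two DOIs.
SPLIT_CHARS = "+&?"


def read_doi(text, start=0):
    """Read one DOI beginning at `start`, applying the splitting rule."""
    i = start
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            break
        if ch in SPLIT_CHARS:
            rest = text[i + 1:]
            # Stop if the split character is followed by end of input,
            # whitespace, or the start of another DOI.
            if not rest or rest[0].isspace() or DOI_START.match(rest):
                break
        i += 1
    return text[start:i]


def extract_dois(text):
    """Return every DOI found in `text`, in order of appearance."""
    dois = []
    pos = 0
    while True:
        m = DOI_START.search(text, pos)
        if m is None:
            break
        doi = read_doi(text, m.start())
        dois.append(doi)
        # Resume scanning right after the DOI we just read.
        pos = m.start() + len(doi)
    return dois


print(extract_dois("10.1086/591526+10.1088/0004-637X/706/1/L203"))
```

With this rule, a "+" that is followed by ordinary DOI characters (for example `10.1000/a+b`) stays part of the DOI, while a "+" sitting directly in front of another `10.NNNN/` prefix ends it.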