Stack Overflow Asked by Duke Wellington on December 29, 2020
I have a long text in form of a string.
This text includes a lot of questions that are at the same time the headers of sections.
These headers always start with a number + dot + whitespace character combination and end with a question mark. I am trying to extract these strings.
This is what I’ve got so far:
longString.match(/\d.\s+[a-zA-Z]+\s\\?/g)
Sure enough this doesn’t work.
In your example you use [a-zA-Z]+ but you might extend that to match 1 or more word characters using \w+ instead.
The \s\\? part at the end of the pattern matches an expected whitespace char followed by an optional backslash, so it never matches the question mark itself.
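To see why the attempt fails, here is a minimal sketch. It assumes the pattern in the question was written as /\d.\s+[a-zA-Z]+\s\\?/g, and the sample string is made up for illustration.

```javascript
// Hypothetical sample text; the last header ends in a literal backslash
// to show what the trailing \\? really matches.
const sample = "1. Why? 2. What is regex? 3. Ok \\";

// Assumed form of the question's attempt: \s\\? is "one whitespace char,
// then an optional backslash", not "a question mark".
const attempt = /\d.\s+[a-zA-Z]+\s\\?/g;

const attemptMatches = sample.match(attempt) || [];
console.log(attemptMatches);

// Only a single word is matched after the number, and no match
// ends with a question mark.
console.log(attemptMatches.some((m) => m.endsWith("?"))); // false
```

"1. Why?" is skipped entirely because the pattern demands a whitespace char right after the word, and the other headers are cut off after their first word.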
To match multiple words, you can optionally repeat the pattern to match a word preceded by 1 or more whitespace characters.
One option is to use
\d\.\s+\w+(?:\s+\w+)*\s*\?
Explanation
\d   Match a single digit (for 1 or more digits use \d+)
\.\s+\w+   Match a . followed by 1+ whitespace chars and 1+ word chars
(?:\s+\w+)*   Optionally repeat 1+ whitespace chars and 1+ word chars
\s*\?   Match 0+ whitespace chars and a question mark

A broader match might be matching, at least once, any char except a question mark or whitespace char after the digit, dot and whitespace:
\d\.\s+[^\s?]+(?:\s+[^\s?]+)*\?
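As a rough sketch of how the two patterns behave, the snippet below runs both against a made-up string. The text in longString and the names wordsOnly, broader and broaderMultiDigit are assumptions for illustration; the last pattern applies the \d+ variation mentioned in the explanation.

```javascript
// Hypothetical input: three numbered question headers mixed with body text.
const longString =
  "Intro text. 1. What is a regex? Some body text here. " +
  "2. Why doesn't it match? More text. 12. Is this the end?";

// Word-based pattern: digit, dot, whitespace, then words of \w chars only.
const wordsOnly = /\d\.\s+\w+(?:\s+\w+)*\s*\?/g;

// Broader pattern: words may contain any char except whitespace or "?".
const broader = /\d\.\s+[^\s?]+(?:\s+[^\s?]+)*\?/g;

// Same broader pattern, but allowing multi-digit numbers with \d+.
const broaderMultiDigit = /\d+\.\s+[^\s?]+(?:\s+[^\s?]+)*\?/g;

const wordMatches = longString.match(wordsOnly) || [];
const broadMatches = longString.match(broader) || [];
const multiDigitMatches = longString.match(broaderMultiDigit) || [];

// The apostrophe in "doesn't" is not a word char, so wordsOnly skips header 2.
console.log(wordMatches);
// broader finds all three headers, but "12." is matched as "2." only.
console.log(broadMatches);
// With \d+ the full number "12." is kept.
console.log(multiDigitMatches);
```

Note that the broader pattern also accepts dots and other punctuation inside the header, so in real text a stray "digit, dot, space" sequence earlier in a paragraph could start a match that runs up to the next question mark.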
Answered by The fourth bird on December 29, 2020