Format string into pattern
Regular expressions are one of the basic skills for writing a web crawler.
pattern = re.compile(r'abc')  # compile the string into a pattern object; speeds up repeated matching
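A minimal sketch of how a compiled pattern might be reused; the sample text and variable names are assumptions made for illustration, not part of the original notes.

```python
import re

# Compile once, then reuse the same pattern object for several operations.
pattern = re.compile(r'abc')

text = 'abc xabcx ABC'  # hypothetical sample text

first = pattern.search(text)                         # first match object, or None
all_matches = pattern.findall(text)                  # every non-overlapping match as a string
spans = [m.span() for m in pattern.finditer(text)]   # (start, end) index of each match

print(first.group())   # abc
print(all_matches)     # ['abc', 'abc']  (the upper-case 'ABC' is not matched)
print(spans)           # [(0, 3), (5, 8)]
```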
Metacharacters that need to be escaped (prefix them with \ to match them literally):
. ^ $ * + ? [ ] { } \ | ( )
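A short sketch of what escaping changes in practice, using a made-up price string. `re.escape` is the standard-library helper that adds the backslashes for you; the sample strings are assumptions.

```python
import re

price_text = 'Total: $3.50 (approx.)'  # hypothetical sample text

# Escaped '$' and '.' are matched literally.
literal = re.search(r'\$3\.50', price_text)

# An unescaped '.' matches any character, so '3.50' also matches '3x50'.
loose = re.search(r'3.50', '3x50')
strict = re.search(r'3\.50', '3x50')   # escaped dot: no match here

# re.escape() escapes every metacharacter in an arbitrary string.
escaped = re.escape('(approx.)')
found = re.search(escaped, price_text)

print(literal.group())   # $3.50
print(loose.group())     # 3x50
print(strict)            # None
print(found.group())     # (approx.)
```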
The special characters
1 | <!--- |
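The original list of special characters is cut off above, so the sketch below only illustrates a few common special sequences from Python's `re` module (`\d`, `\D`, `\w`, `\s`, `^`, `$`). The selection and the sample string are assumptions, not the author's list.

```python
import re

sample = 'Order 66 shipped_to: Bob  Smith'  # hypothetical sample text

digits = re.findall(r'\d+', sample)         # \d  digit characters
words = re.findall(r'\w+', sample)          # \w  letters, digits and underscore
spaces = re.findall(r'\s+', sample)         # \s  whitespace
prefix = re.match(r'\D+', sample).group()   # \D  anything that is not a digit
starts = bool(re.search(r'^Order', sample)) # ^   start of the string
ends = bool(re.search(r'Smith$', sample))   # $   end of the string

print(digits)   # ['66']
print(words)    # ['Order', '66', 'shipped_to', 'Bob', 'Smith']
print(spaces)   # [' ', ' ', ' ', '  ']
print(repr(prefix), starts, ends)
```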
The code block below shows how a word boundary (\b) works:
text_to_search = 'ha haha'
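The original block is truncated after its first line, so here is one possible way to continue it. The patterns below are assumptions chosen to show `\b` (word boundary) and `\B` (not a word boundary) on the same string.

```python
import re

text_to_search = 'ha haha'

def spans(regex, text):
    """Return the (start, end) span of every match of regex in text."""
    return [m.span() for m in re.finditer(regex, text)]

# 'ha' at the start of a word: the lone 'ha' and the first half of 'haha'.
at_word_start = spans(r'\bha', text_to_search)

# 'ha' at the end of a word: the lone 'ha' and the second half of 'haha'.
at_word_end = spans(r'ha\b', text_to_search)

# 'ha' as a whole word only.
whole_word = spans(r'\bha\b', text_to_search)

# \B is the opposite: 'ha' that does NOT start at a word boundary.
inside_word = spans(r'\Bha', text_to_search)

print(at_word_start)  # [(0, 2), (3, 5)]
print(at_word_end)    # [(0, 2), (5, 7)]
print(whole_word)     # [(0, 2)]
print(inside_word)    # [(5, 7)]
```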
Some examples
urls = '''
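The `urls` string above is cut off in the source, so the sketch below uses a made-up list of reserved example domains. The pattern and group layout are assumptions; they show one common way to pull the domain out of simple URLs with capture groups.

```python
import re

# Hypothetical URL list (the original contents are not available).
urls = '''
https://www.example.com
http://example.org
https://example.net
https://www.example.edu
'''

# group 1: optional 'www.', group 2: domain name, group 3: top-level domain
pattern = re.compile(r'https?://(www\.)?(\w+)(\.\w+)')

# Join groups 2 and 3 to get the bare domain of each URL.
domains = [m.group(2) + m.group(3) for m in pattern.finditer(urls)]

# The same idea with substitution: replace each URL by its bare domain.
stripped = pattern.sub(r'\2\3', urls)

print(domains)  # ['example.com', 'example.org', 'example.net', 'example.edu']
print(stripped)
```

This simple pattern does not handle subdomains other than `www.`, paths, or ports; a real crawler would more likely use `urllib.parse` for that.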
Add a flag to ignore case:
re.compile(r's', re.I)
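A small sketch of the effect of the flag, using a made-up sample string. `re.I` is short for `re.IGNORECASE`, and the inline form `(?i)` is an equivalent way to switch it on inside the pattern.

```python
import re

text = 'Sunny seaSide'  # hypothetical sample text

case_sensitive = re.findall(r's', text)

# Same pattern compiled with the ignore-case flag.
pattern = re.compile(r's', re.I)
case_insensitive = pattern.findall(text)

# Inline flag at the start of the pattern, no second argument needed.
inline_flag = re.findall(r'(?i)s', text)

print(case_sensitive)    # ['s']
print(case_insensitive)  # ['S', 's', 'S']
print(inline_flag)       # ['S', 's', 'S']
```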