Use of Regular Expressions
Regular expressions are a powerful syntax for finding specific patterns in text. They can be used in several places in OutWit Hub:
- In the bottom panel of each widget (images, links, contacts...), the Select If Contains text box lets you select the items in the list above it that contain the typed string. If you start and end the string with the character /, it is treated as a regular expression.
- In the Scraper Editor, located in the 'Scrapers' widget, Marker Before and Marker After can each be either a literal string or a regular expression. Format is always interpreted as a regular expression.
- Lastly, the 'URL to Scrape' assigned to a scraper can also be a regular expression. In this case, the scraper can be applied to any URL matching the pattern.
To use regular expressions, write your string between slashes: /myRegExp/. The pattern will be displayed in green when the syntax is correct, in red otherwise.
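The slash convention can be illustrated with a short Python sketch. This is not OutWit Hub's own code: the helper names compile_filter and select_if_contains and the sample URLs are hypothetical, and Python's re module is only a stand-in for the regular expression engine that OutWit Hub actually uses, whose syntax may differ in places.

```python
import re

def compile_filter(text):
    """Hypothetical helper: /.../ is read as a regular expression,
    anything else as a literal string."""
    if len(text) >= 2 and text.startswith("/") and text.endswith("/"):
        return re.compile(text[1:-1])      # strip the slashes, keep the pattern
    return re.compile(re.escape(text))     # escape so that "." means a real dot

def select_if_contains(items, text):
    """Return the items that contain the typed string or match the pattern."""
    pattern = compile_filter(text)
    return [item for item in items if pattern.search(item)]

# Sample data invented for this illustration.
links = [
    "http://example.com/page1.html",
    "http://example.com/image.png",
    "http://example.com/a.b/page2.html",
]

literal_hits = select_if_contains(links, "a.b")                # literal search
regex_hits = select_if_contains(links, "/page\\d+\\.html$/")   # regular expression
print(literal_hits)
print(regex_hits)
```

Without slashes, a.b only matches the text "a.b" itself. Between slashes, the same characters would be read as a pattern in which the dot stands for any character.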
IMPORTANT NOTES:
The 'Format' field of the Scraper Editor is always interpreted as a regular expression, even if it is not enclosed in slashes.
For technical reasons, all regular expressions used in scrapers are interpreted as case-insensitive patterns by default, so [A-Z], [A-Za-z] and [a-z] all give the same result. This can be changed using the #caseSensitive# directive.
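The effect of case-insensitive matching on character classes can be checked with a small Python sketch. Here the re.IGNORECASE flag stands in for the scraper default, and omitting it stands in for what the #caseSensitive# directive enables. This mapping is an assumption made for illustration, not a description of how OutWit Hub is implemented.

```python
import re

text = "OutWit Hub"
patterns = ["[A-Z]+", "[a-z]+", "[A-Za-z]+"]

# Case-insensitive (the scraper default): the three classes behave identically.
insensitive = [re.findall(p, text, re.IGNORECASE) for p in patterns]

# Case-sensitive: each class now matches only the letters it lists.
sensitive = [re.findall(p, text) for p in patterns]

for p, a, b in zip(patterns, insensitive, sensitive):
    print(p, a, b)
```

In case-insensitive mode every pattern returns the two whole words. In case-sensitive mode only [A-Za-z]+ still does, while [A-Z]+ and [a-z]+ split the words at each change of case.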
Here is what you should know if you are using regular expressions: