Script Valley
Regex: Actually Useful Patterns
Regex for Data Extraction and Transformation
Lesson 6.2

Extracting emails, phone numbers, and URLs from unstructured text

extract vs validate distinction, word boundary use, overlapping matches, deduplication, running multiple patterns over same text

Extraction Is Looser Than Validation

Data extraction

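Validation checks whether an entire string IS a value, so a validation pattern is anchored to the full string with ^ and $. Extraction finds every occurrence of a value IN a longer string, so an extraction pattern drops those anchors, uses the g flag to collect all matches, and relies on word boundaries (\b) where needed to avoid matching inside a longer token.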
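To make the distinction concrete, here is a minimal sketch that applies the same five-digit pattern both ways. The five-digit code format, the helper names isZip and findZips, and the sample string are assumptions chosen for illustration, not part of the lesson.

```javascript
// Validation: ^ and $ force the pattern to cover the whole string.
const isZip = (s) => /^\d{5}$/.test(s);

// Extraction: \b lets the pattern match inside longer text while
// refusing to match part of a longer run of digits.
const findZips = (s) => s.match(/\b\d{5}\b/g) ?? [];

// Hypothetical sample text.
const note = 'Ship to 94103 or 10001, order 1234567';

console.log(isZip('94103'));  // true
console.log(isZip(note));     // false
console.log(findZips(note));  // [ '94103', '10001' ]
```

The seven-digit order number is skipped because no five-digit slice of it has a word boundary on both sides.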

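const text = 'Contact sales@example.com or call 415-555-0000. See https://example.com';

// With the g flag, match() returns every match, or null when there are none, hence ?? [].
const emails = text.match(/[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g) ?? [];
// A leading \b fails before "(" after a space, so accept "(415)" or a boundary-started "415".
const phones = text.match(/(?:\(\d{3}\)|\b\d{3})[\s.\-]?\d{3}[\s.\-]?\d{4}\b/g) ?? [];
// The path part runs to the next whitespace, so it can pick up trailing punctuation.
const urls = text.match(/https?:\/\/[\w.\-]+(?:\/[^\s]*)?/g) ?? [];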
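Running several patterns over the same text and removing duplicates can be sketched as follows. The helper name extractAll, the patterns object, and the sample string are hypothetical; the sketch assumes every pattern carries the g flag, which matchAll requires.

```javascript
// Hypothetical helper: runs each named pattern over the text and
// returns the unique matches for each, in first-seen order.
function extractAll(text, patterns) {
  const result = {};
  for (const [name, pattern] of Object.entries(patterns)) {
    // matchAll returns an iterator (never null); m[0] is the full match.
    const matches = [...text.matchAll(pattern)].map((m) => m[0]);
    // A Set drops exact duplicates and keeps insertion order.
    result[name] = [...new Set(matches)];
  }
  return result;
}

const patterns = {
  emails: /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g,
  urls: /https?:\/\/[\w.\-]+(?:\/[^\s]*)?/g,
};

// Hypothetical sample text with a repeated email address.
const sample = 'Mail sales@example.com or sales@example.com. Docs: https://example.com/docs';
const found = extractAll(sample, patterns);

console.log(found.emails); // [ 'sales@example.com' ]
console.log(found.urls);   // [ 'https://example.com/docs' ]
```

The Set only removes exact duplicates. If values that differ only in letter case should count as the same, normalize them (for example with toLowerCase()) before deduplicating.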
