Overview
Scala URL Detector is a Scala library that detects and extracts URLs from unstructured text, with support for multiple content formats. It is based on a fork of the LinkedIn Engineering team's open-source URL detector library.
Features
- Multiple Detection Modes: Support for HTML, XML, JSON, JavaScript, and plain text
- Smart URL Parsing: Handles URLs with or without schemes, protocol-relative URLs, and encoded characters
- Host Filtering: Allow or deny specific hosts with intelligent subdomain matching
- Format-Aware Extraction: Context-aware detection for different content types
- IPv4 & IPv6 Support: Recognizes both IPv4 and IPv6 addresses
- Type-Safe API: Uses scala-uri for strongly-typed URL representations
- Thread-Safe: Immutable data structures safe for concurrent use
- Cross-Built: Published for Scala 2.12, 2.13, and 3.x
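To make the "subdomain matching" in the Host Filtering feature concrete, here is a minimal sketch of one plausible rule: a host passes a filter entry when it equals the entry or is a subdomain of it. This is an illustration only, not the library's implementation. The object `HostMatchSketch` and the method `matches` are hypothetical names invented for this example, and the library's real matching rules may differ.

```scala
// Hypothetical illustration of subdomain-aware host matching.
// NOT the library's code: `HostMatchSketch` and `matches` are invented names.
object HostMatchSketch {

  /** True when `host` equals `rule` or is a subdomain of `rule`.
    * Host names are compared case-insensitively.
    */
  def matches(rule: String, host: String): Boolean = {
    val r = rule.toLowerCase
    val h = host.toLowerCase
    // Requiring the "." separator keeps "notexample.com" from matching "example.com".
    h == r || h.endsWith("." + r)
  }
}
```

Under this rule, `HostMatchSketch.matches("example.com", "api.example.com")` is true, while `HostMatchSketch.matches("example.com", "notexample.com")` is false because the suffix check requires a dot boundary.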
Installation
To use the latest release of Scala URL Detector in your project, add the following to your build.sbt file:
libraryDependencies += "io.lambdaworks" %% "scurl-detector" % "1.3.0"
Quick Start
import io.lambdaworks.detection.UrlDetector
import io.lemonlabs.uri.AbsoluteUrl
// Basic usage
val detector = UrlDetector.default
val urls: Set[AbsoluteUrl] = detector.extract("Visit https://example.com")
// With specific options
import io.lambdaworks.detection.UrlDetectorOptions
val htmlDetector = UrlDetector(UrlDetectorOptions.Html)
val htmlUrls = htmlDetector.extract("<a href='https://example.com'>Link</a>")
// With host filtering
import io.lemonlabs.uri.Host
val filtered = UrlDetector.default
  .withAllowed(Host.parse("example.com"))
  .extract("Visit example.com and other.com")
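Because the detector is described above as immutable and safe for concurrent use, one instance can in principle be shared across threads. The sketch below shows one way to fan extraction out over several documents with Scala futures. It takes the extraction function as a parameter so the block compiles without the library on the classpath. `ParallelExtract`, `extractAll`, and the 30-second timeout are assumptions made for this example, not part of the library.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical helper, not part of the library.
object ParallelExtract {

  /** Runs `extract` on every document in parallel and returns the results
    * in the same order as the input documents.
    * Assumes `extract` is safe to call from several threads at once.
    */
  def extractAll[A](documents: Seq[String])(extract: String => Set[A]): Seq[Set[A]] = {
    // One Future per document; Future.traverse preserves input order.
    val work = Future.traverse(documents)(doc => Future(extract(doc)))
    // Blocking keeps the sketch simple; the timeout is an arbitrary choice.
    Await.result(work, 30.seconds)
  }
}
```

With the library available, a shared detector could be passed in as `val detector = UrlDetector.default` followed by `ParallelExtract.extractAll(docs)(detector.extract)`, which yields one `Set[AbsoluteUrl]` per input document. In non-blocking code, return the `Future` instead of awaiting it.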
Documentation
Getting Started
- Usage Guide - Basic usage and core concepts
- Examples - Comprehensive examples for various use cases
- Detection Options - Complete reference for all detection modes
Advanced Topics
- Advanced Usage - Advanced patterns and techniques
- API Reference - Complete API documentation
- Architecture & Design - Internal architecture and design principles
Help & Support
- FAQ & Troubleshooting - Common questions and solutions
Next Steps
- Read the Usage Guide to understand the core concepts
- Explore Examples to see the library in action
- Check out Detection Options to choose the right mode for your content
- Review the API Reference for detailed method documentation