pstreams-providers/.docs/content/3.in-depth/2.provider-system.md
2025-07-15 17:07:06 -06:00

231 lines
6.9 KiB
Markdown

# Provider System Overview
Understanding how the provider system works is crucial for building effective scrapers.
## The all.ts Registration System
All scrapers must be registered in `src/providers/all.ts`. This central file exports two main functions that gather all available providers:
```typescript
// src/providers/all.ts
import { Embed, Sourcerer } from '@/providers/base';
import { embedsuScraper } from './sources/embedsu';
import { turbovidScraper } from './embeds/turbovid';
export function gatherAllSources(): Array<Sourcerer> {
return [
cuevana3Scraper,
catflixScraper,
embedsuScraper, // Your source scraper goes here
// ... more sources
];
}
export function gatherAllEmbeds(): Array<Embed> {
return [
upcloudScraper,
turbovidScraper, // Your embed scraper goes here
// ... more embeds
];
}
```
**Why this matters:**
- Only registered scrapers are available to the library
- The order in these arrays doesn't matter (ranking determines priority)
- You must import your scraper and add it to the appropriate function
## Provider Types
There are two distinct types of providers in the system:
### Sources (Primary Scrapers)
**Sources** find content on websites and return either:
- **Direct video streams** (ready to play immediately)
- **Embed URLs** that need further processing by embed scrapers
**Characteristics:**
- Handle website navigation and search
- Process TMDB IDs to find content
- Can return multiple server options
- Located in `src/providers/sources/`
**Example source workflow:**
1. Receive movie/show request with TMDB ID
2. Search the target website for that content
3. Extract embed player URLs or direct streams
4. Return results for further processing
### Embeds (Secondary Scrapers)
**Embeds** extract playable video streams from embed players:
- Take URLs from sources as input
- Handle player-specific extraction and decryption
- Always return direct streams (never more embeds)
**Characteristics:**
- Focus on one player type (turbovid, mixdrop, etc.)
- Handle complex decryption/obfuscation
- Specialized for specific player technologies
- Located in `src/providers/embeds/`
**Example embed workflow:**
1. Receive embed player URL from a source
2. Fetch and parse the embed page
3. Extract/decrypt the video stream URLs
4. Return playable HLS or MP4 streams
## Ranking System
Every scraper has a **rank** that determines its priority in the execution queue:
### How Ranking Works
- **Higher numbers = Higher priority** (processed first)
- **Each rank must be unique** across all providers
- Sources and embeds have separate ranking spaces
- Failed scrapers are skipped, next rank is tried
### Rank Ranges
Usually ranks should be on 10s: 110, 120, 130...
```typescript
// Typical rank ranges (not enforced, but conventional)
Sources: 1-300
Embeds: 1-250
// Example rankings
export const embedsuScraper = makeSourcerer({
id: 'embedsu',
rank: 165, // Medium priority source
// ...
});
export const turbovidScraper = makeEmbed({
id: 'turbovid',
rank: 122, // Medium priority embed
// ...
});
```
### Choosing a Rank
**For Sources:**
- **200+**: High-quality, reliable sources (fast APIs, good uptime)
- **100-199**: Medium reliability sources (most scrapers fall here)
- **1-99**: Lower priority or experimental sources
**For Embeds:**
- **200+**: Fast, reliable embeds (direct URLs, minimal processing)
- **100-199**: Standard embeds (typical decryption/extraction)
- **1-99**: Slow or unreliable embeds (complex decryption, poor uptime)
### Finding Available Ranks
Before choosing a rank, check what's already taken:
```sh
# Search for existing ranks
grep -r "rank:" src/providers/ | sort -t: -k3 -n
```
Or check the cli to see the ranks.
::alert{type="warning"}
**Duplicate ranks will cause conflicts!** Always verify your chosen rank is unique before submitting.
::
## Provider Configuration
Each provider is configured using `makeSourcerer()` or `makeEmbed()`:
### Source Configuration
```typescript
export const mySourceScraper = makeSourcerer({
id: 'my-source', // Unique identifier (kebab-case)
name: 'My Source', // Display name (human-readable)
rank: 150, // Priority rank (must be unique)
disabled: false, // Whether scraper is disabled
flags: [], // Feature flags (see Advanced Concepts)
scrapeMovie: comboScraper, // Function for movies
scrapeShow: comboScraper, // Function for TV shows
});
```
### Embed Configuration
```typescript
export const myEmbedScraper = makeEmbed({
id: 'my-embed', // Unique identifier (kebab-case)
name: 'My Embed', // Display name (human-readable)
rank: 120, // Priority rank (must be unique)
disabled: false, // Whether scraper is disabled
async scrape(ctx) { // Single scrape function for embeds
// ... extraction logic
return { stream: [...] };
},
});
```
## How Providers Work Together
The provider system creates a powerful pipeline:
### 1. Source → Embed Chain
```
User Request → Source Scraper → Embed URLs → Embed Scraper → Video Stream → Player
```
**Pipeline Steps:**
1. **User Request** - User wants to watch content
2. **Source Scraper** - Finds content on websites
3. **Embed URLs** - Returns player URLs that need processing
4. **Embed Scraper** - Extracts streams from player URLs
5. **Video Stream** - Final playable stream
6. **Player** - User watches the content
### 2. Multiple Server Options
Sources can provide multiple backup servers:
```typescript
// Source returns multiple embed options
return {
embeds: [
{ embedId: 'turbovid', url: 'https://turbovid.com/abc' },
{ embedId: 'mixdrop', url: 'https://mixdrop.co/def' },
{ embedId: 'dood', url: 'https://dood.watch/ghi' }
]
};
```
### 3. Fallback System
If one embed fails, the system tries the next:
1. Try turbovid embed (rank 122)
2. If fails, try mixdrop embed (rank 198)
3. If fails, try dood embed (rank 173)
4. Continue until success or all options exhausted
## Best Practices
### Naming Conventions
- **IDs**: Use kebab-case (`my-scraper`, not `myMyscraper` or `My_Scraper`)
- **Names**: Use proper capitalization (`VidCloud`, not `vidcloud` or `VIDCLOUD`)
- **Files**: Match the ID (`my-scraper.ts` for ID `my-scraper`)
### Registration Order
- The order in `all.ts` arrays doesn't affect execution (rank does)
- Group similar scrapers together for maintainability
- Add imports at the top, organized logically
### Testing Integration
Always test that your registration works:
```sh
# Verify your scraper appears in the list (interactive mode shows all available)
pnpm cli
# Test your specific scraper
pnpm cli --source-id my-scraper --tmdb-id 11527
```
## Next Steps
Now that you understand the provider system:
1. Learn the details in [Building Scrapers](/in-depth/building-scrapers)
2. Study [Advanced Concepts](/in-depth/advanced-concepts) for flags and error handling
3. Look at the [Sources vs Embeds](/in-depth/sources-and-embeds) guide for more examples