mirror of
https://github.com/p-stream/providers.git
synced 2026-05-14 23:11:09 +00:00
# Setup and Prerequisites

Before you start building scrapers, you need to set up your development environment and understand the testing workflow.

## Environment Setup

### 1. Create Environment File

Create a `.env` file in the root of the repository with the following variables:

```env
MOVIE_WEB_TMDB_API_KEY = "your_tmdb_api_key_here"
MOVIE_WEB_PROXY_URL = "https://your-proxy-url.com" # Optional
```

**Getting a TMDB API Key:**
1. Create an account at [TheMovieDB](https://www.themoviedb.org/)
2. Go to Settings > API
3. Request an API key (choose "Developer" for free usage)
4. Use the provided key in your `.env` file

**Proxy URL (Optional):**
- Useful for testing scrapers that require proxy access
- Can help bypass geographical restrictions during development
- If not provided, the library will use default proxy services
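
A missing or empty `MOVIE_WEB_TMDB_API_KEY` is an easy mistake to make at this step. The TypeScript sketch below shows one way to check the text of a `.env` file before running the CLI. `parseEnv` and `missingEnvKeys` are hypothetical helpers written for this guide, not functions from the library, and the parser only handles simple `KEY = "value"` lines like the ones above.

```typescript
// Hypothetical helpers, not part of the library.
// parseEnv reads simple KEY = value lines; it is not a full dotenv parser.
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not an assignment
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quoted = value.match(/^(["'])(.*?)\1/);
    if (quoted) {
      value = quoted[2] ?? ""; // keep only the text inside the quotes
    } else {
      value = value.replace(/\s+#.*$/, "").trim(); // drop a trailing inline comment
    }
    vars.set(key, value);
  }
  return vars;
}

// Returns the required keys that are absent or set to an empty string.
function missingEnvKeys(text: string, required: string[]): string[] {
  const vars = parseEnv(text);
  return required.filter((key) => !vars.get(key));
}

const sampleEnv = [
  'MOVIE_WEB_TMDB_API_KEY = "your_tmdb_api_key_here"',
  'MOVIE_WEB_PROXY_URL = "https://your-proxy-url.com" # Optional',
].join("\n");

console.log(missingEnvKeys(sampleEnv, ["MOVIE_WEB_TMDB_API_KEY"]));
console.log(missingEnvKeys("", ["MOVIE_WEB_TMDB_API_KEY"]));
```

Only the TMDB key is treated as required here, since the proxy URL is optional.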

### 2. Install Dependencies

Install all required dependencies:

```sh
pnpm install
```

## Familiarize Yourself with the CLI

The library provides a CLI tool that's essential for testing scrapers during development. Scrapers are too unreliable to cover with unit tests, so the CLI is your primary testing tool.

### Interactive Mode

The easiest way to test is to use interactive mode:

```sh
pnpm cli
```

This will prompt you for:
- **Fetcher mode** (native, node-fetch, browser)
- **Scraper ID** (source or embed)
- **TMDB ID** for the content (for sources)
- **Embed URL** (for testing embeds directly)
- **Season/episode numbers** (for TV shows)

### Command Line Mode

For repeatability and automation, you can specify arguments directly:

```sh
# Get help with all available options
pnpm cli --help

# Test a movie scraper
pnpm cli --source-id catflix --tmdb-id 11527

# Test a TV show scraper (Arcane S1E1)
pnpm cli --source-id zoechip --tmdb-id 94605 --season 1 --episode 1

# Test an embed scraper directly with a URL
pnpm cli --source-id turbovid --url "https://turbovid.eu/embed/DjncbDBEmbLW"
```

### Common CLI Examples

```sh
# Popular test cases
pnpm cli --source-id catflix --tmdb-id 11527  # The Shining
pnpm cli --source-id embedsu --tmdb-id 129    # Spirited Away
pnpm cli --source-id vidsrc --tmdb-id 94605 --season 1 --episode 1  # Arcane S1E1

# Testing different fetcher modes
pnpm cli --fetcher native --source-id catflix --tmdb-id 11527
pnpm cli --fetcher browser --source-id catflix --tmdb-id 11527
```
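
When the same checks are repeated often, it can help to describe each run as data and generate the command from it. The following is a minimal sketch in TypeScript that uses only the flags shown in this guide. The `CliCase` shape and the `buildCliArgs` function are assumptions made for illustration and are not provided by the repository.

```typescript
// Hypothetical test-case shape; only flags that appear in this guide are used.
interface CliCase {
  sourceId: string;
  tmdbId?: number;
  season?: number;
  episode?: number;
  url?: string;
  fetcher?: "native" | "node-fetch" | "browser";
}

// Builds the argument list that follows `pnpm` for one test case.
function buildCliArgs(c: CliCase): string[] {
  const args: string[] = ["cli"];
  if (c.fetcher !== undefined) args.push("--fetcher", c.fetcher);
  args.push("--source-id", c.sourceId);
  if (c.tmdbId !== undefined) args.push("--tmdb-id", String(c.tmdbId));
  if (c.season !== undefined) args.push("--season", String(c.season));
  if (c.episode !== undefined) args.push("--episode", String(c.episode));
  if (c.url !== undefined) args.push("--url", c.url);
  return args;
}

const cliCases: CliCase[] = [
  { sourceId: "catflix", tmdbId: 11527 },
  { sourceId: "vidsrc", tmdbId: 94605, season: 1, episode: 1 },
  { sourceId: "catflix", tmdbId: 11527, fetcher: "browser" },
];

// Print each command so it can be reviewed or pasted into a terminal.
for (const c of cliCases) {
  console.log(["pnpm", ...buildCliArgs(c)].join(" "));
}
```

If the commands are executed from a script, passing the argument array to a process spawner instead of joining it into one string avoids shell quoting problems with `--url` values.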

### Fetcher Options

The CLI supports different fetcher modes:

- **`native`**: Uses Node.js built-in fetch (undici); the fastest option
- **`node-fetch`**: Uses the node-fetch library
- **`browser`**: Starts headless Chrome for a browser-like environment

::alert{type="warning"}
The browser fetcher requires running `pnpm build` first, otherwise you'll get outdated results.
::

### Understanding CLI Output

#### Source Scraper Output (Returns Embeds)

```sh
pnpm cli --source-id catflix --tmdb-id 11527
```

Example output:

```json
{
  embeds: [
    {
      embedId: 'turbovid',
      url: 'https://turbovid.eu/embed/DjncbDBEmbLW'
    }
  ]
}
```

#### Embed Scraper Output (Returns Streams)

```sh
pnpm cli --source-id turbovid --url "https://turbovid.eu/embed/DjncbDBEmbLW"
```

Example output:

```json
{
  stream: [
    {
      type: 'hls',
      id: 'primary',
      playlist: 'https://proxy.fifthwit.net/m3u8-proxy?url=https%3A%2F%2Fqueenselti.pro%2Fwrofm%2Fuwu.m3u8&headers=%7B%22referer%22%3A%22https%3A%2F%2Fturbovid.eu%2F%22%2C%22origin%22%3A%22https%3A%2F%2Fturbovid.eu%22%7D',
      flags: [],
      captions: []
    }
  ]
}
```

**Notice the proxied URL**: The `createM3U8ProxyUrl()` function creates URLs like `https://proxy.fifthwit.net/m3u8-proxy?url=...&headers=...` to handle protected streams. Read more about this in [Advanced Concepts](/in-depth/advanced-concepts).
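
While debugging, it is often useful to see which upstream playlist and headers are hidden inside a proxied URL. The sketch below decodes the `url` and `headers` query parameters visible in the example output using the standard `URL` API. `inspectProxiedPlaylist` is a hypothetical helper for this guide; it assumes the parameter names shown above and says nothing about how `createM3U8ProxyUrl()` is implemented.

```typescript
// Hypothetical debugging helper, not part of the library.
// Assumes the `url` and `headers` query parameters seen in the example output.
interface ProxiedPlaylist {
  proxyOrigin: string;
  upstreamUrl: string;
  headers: Record<string, string>;
}

function inspectProxiedPlaylist(playlist: string): ProxiedPlaylist {
  const parsed = new URL(playlist);
  // searchParams.get() already percent-decodes the value.
  const upstreamUrl = parsed.searchParams.get("url");
  if (upstreamUrl === null) {
    throw new Error("playlist has no `url` query parameter");
  }
  const rawHeaders = parsed.searchParams.get("headers");
  // The headers parameter is a JSON object once decoded.
  const headers: Record<string, string> =
    rawHeaders === null ? {} : JSON.parse(rawHeaders);
  return { proxyOrigin: parsed.origin, upstreamUrl, headers };
}

// The playlist value copied from the example output.
const examplePlaylist =
  "https://proxy.fifthwit.net/m3u8-proxy?url=https%3A%2F%2Fqueenselti.pro%2Fwrofm%2Fuwu.m3u8" +
  "&headers=%7B%22referer%22%3A%22https%3A%2F%2Fturbovid.eu%2F%22%2C%22origin%22%3A%22https%3A%2F%2Fturbovid.eu%22%7D";

console.log(inspectProxiedPlaylist(examplePlaylist));
```

For the example playlist, the upstream URL is `https://queenselti.pro/wrofm/uwu.m3u8`, with `referer` and `origin` headers pointing at `turbovid.eu`.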

#### Interactive Mode Flow

```sh
pnpm cli
```

```
✔ Select a fetcher mode · native
✔ Select a source · catflix
✔ TMDB ID · 11527
✔ Media type · movie
✓ Done!
{
  embeds: [
    {
      embedId: 'turbovid',
      url: 'https://turbovid.eu/embed/DjncbDBEmbLW'
    }
  ]
}
```
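
In either mode, the printed result has one of two shapes: a source returns `embeds`, and an embed returns `stream`. The sketch below models just the fields visible in the example outputs and summarizes a result in one line. The type names and `summarizeResult` are hypothetical and simplified; the library's own output types may carry more fields.

```typescript
// Hypothetical shapes that mirror only the fields visible in the example outputs.
interface EmbedRef {
  embedId: string;
  url: string;
}

interface StreamEntry {
  type: string;
  id: string;
  playlist?: string;
  flags: string[];
  captions: unknown[];
}

type CliResult = { embeds: EmbedRef[] } | { stream: StreamEntry[] };

// Produces a one-line summary, flagging empty results that usually mean a failed scrape.
function summarizeResult(result: CliResult): string {
  if ("embeds" in result) {
    if (result.embeds.length === 0) return "source returned no embeds";
    const ids = result.embeds.map((e) => e.embedId).join(", ");
    return `source returned ${result.embeds.length} embed(s): ${ids}`;
  }
  if (result.stream.length === 0) return "embed returned no streams";
  const streams = result.stream.map((s) => `${s.id} (${s.type})`).join(", ");
  return `embed returned ${result.stream.length} stream(s): ${streams}`;
}

const sourceResult: CliResult = {
  embeds: [{ embedId: "turbovid", url: "https://turbovid.eu/embed/DjncbDBEmbLW" }],
};

console.log(summarizeResult(sourceResult));
```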

## Development Workflow

1. **Setup**: Create a `.env` file and install dependencies
2. **Research**: Study the target website's structure and player technology
3. **Code**: Build your scraper following the established patterns
4. **Register**: Add it to `all.ts` with a unique rank
5. **Test**: Use the CLI to test with multiple different movies and TV shows
6. **Iterate**: Fix issues and improve reliability
7. **Submit**: Create a pull request with thorough testing documentation
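
Step 4 asks for a unique rank. One way to catch a collision before opening a pull request is a small check like the following sketch. The `RankedScraper` shape, the rank values, and `findDuplicateRanks` are hypothetical and exist only for illustration; how ranks are actually declared is defined by the repository's scraper files and `all.ts`, which this page does not show.

```typescript
// Hypothetical shape: just an ID and a numeric rank.
interface RankedScraper {
  id: string;
  rank: number;
}

// Groups scraper IDs by rank and returns only the ranks used more than once.
function findDuplicateRanks(scrapers: RankedScraper[]): Map<number, string[]> {
  const byRank = new Map<number, string[]>();
  for (const scraper of scrapers) {
    const ids = byRank.get(scraper.rank) ?? [];
    ids.push(scraper.id);
    byRank.set(scraper.rank, ids);
  }
  const duplicates = new Map<number, string[]>();
  byRank.forEach((ids, rank) => {
    if (ids.length > 1) duplicates.set(rank, ids);
  });
  return duplicates;
}

// Rank values here are made up for the example.
const registered: RankedScraper[] = [
  { id: "catflix", rank: 160 },
  { id: "zoechip", rank: 170 },
  { id: "my-new-scraper", rank: 160 },
];

findDuplicateRanks(registered).forEach((ids, rank) => {
  console.log(`rank ${rank} is shared by: ${ids.join(", ")}`);
});
```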

## Next Steps

Once your environment is set up:

1. Read [Provider System Overview](/in-depth/provider-system) to understand how scrapers work
2. Read [Building Scrapers](/in-depth/building-scrapers) for a detailed implementation guide
3. Check [Advanced Concepts](/in-depth/advanced-concepts) for error handling and best practices

::alert{type="info"}
Always test your scrapers with multiple different movies and TV shows to ensure reliability across different content types.
::