
# Setup and Prerequisites

Before you start building scrapers, you need to set up your development environment and understand the testing workflow.

## Environment Setup

### 1. Create Environment File

Create a `.env` file in the root of the repository with the following variables:

```env
MOVIE_WEB_TMDB_API_KEY = "your_tmdb_api_key_here"
MOVIE_WEB_PROXY_URL = "https://your-proxy-url.com"  # Optional
```

**Getting a TMDB API Key:**
1. Create an account at [TheMovieDB](https://www.themoviedb.org/)
2. Go to Settings > API
3. Request an API key (choose "Developer" for free usage)
4. Use the provided key in your `.env` file

**Proxy URL (Optional):**
- Useful for testing scrapers that require proxy access
- Can help bypass geographical restrictions during development
- If not provided, the library will use default proxy services
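
As a quick sanity check before running anything, the two variables above can be verified with a small script. This is a minimal sketch, not part of the library: `checkEnv` is a hypothetical helper, and treating the TMDB key as required is an assumption based on only the proxy URL being marked optional.

```typescript
// Hypothetical helper (not part of the library): reports which variables
// from the .env file are unset or blank.
type EnvReport = { missingRequired: string[]; missingOptional: string[] };

// Assumption: the TMDB key is required, since only the proxy URL is marked optional.
const REQUIRED_VARS = ["MOVIE_WEB_TMDB_API_KEY"];
const OPTIONAL_VARS = ["MOVIE_WEB_PROXY_URL"];

function checkEnv(env: Record<string, string | undefined>): EnvReport {
  // A variable counts as unset when it is absent or contains only whitespace.
  const isUnset = (name: string): boolean => (env[name] ?? "").trim() === "";
  return {
    missingRequired: REQUIRED_VARS.filter(isUnset),
    missingOptional: OPTIONAL_VARS.filter(isUnset),
  };
}

// Example with a plain object; in practice you would pass the loaded environment.
const report = checkEnv({ MOVIE_WEB_TMDB_API_KEY: "your_tmdb_api_key_here" });
console.log(report);
```

With only the TMDB key set, the report lists nothing as missing under `missingRequired` and lists `MOVIE_WEB_PROXY_URL` under `missingOptional`.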

### 2. Install Dependencies

Install all required dependencies:

```sh
pnpm install
```

## Familiarize Yourself with the CLI

The library provides a CLI tool that's essential for testing scrapers during development. Scrapers are too unreliable to cover with unit tests, so the CLI is your primary testing tool.

### Interactive Mode

The easiest way to test a scraper is interactive mode:

```sh
pnpm cli
```

This will prompt you for:
- **Fetcher mode** (native, node-fetch, browser)
- **Scraper ID** (source or embed)
- **TMDB ID** for the content (for sources)
- **Embed URL** (for testing embeds directly)
- **Season/episode numbers** (for TV shows)

### Command Line Mode

For repeatability and automation, you can specify arguments directly:

```sh
# Get help with all available options
pnpm cli --help

# Test a movie scraper
pnpm cli --source-id catflix --tmdb-id 11527

# Test a TV show scraper (Arcane S1E1)
pnpm cli --source-id zoechip --tmdb-id 94605 --season 1 --episode 1

# Test an embed scraper directly with a URL
pnpm cli --source-id turbovid --url "https://turbovid.eu/embed/DjncbDBEmbLW"
```

### Common CLI Examples

```sh
# Popular test cases
pnpm cli --source-id catflix --tmdb-id 11527        # The Shining
pnpm cli --source-id embedsu --tmdb-id 129          # Spirited Away
pnpm cli --source-id vidsrc --tmdb-id 94605 --season 1 --episode 1    # Arcane S1E1

# Testing different fetcher modes
pnpm cli --fetcher native --source-id catflix --tmdb-id 11527
pnpm cli --fetcher browser --source-id catflix --tmdb-id 11527
```

### Fetcher Options

The CLI supports different fetcher modes:

- **`native`**: Uses Node.js built-in fetch (undici) - fastest
- **`node-fetch`**: Uses the node-fetch library
- **`browser`**: Starts headless Chrome for browser-like environment

::alert{type="warning"}
The browser fetcher requires running `pnpm build` first; otherwise, you'll get outdated results.
::

### Understanding CLI Output

#### Source Scraper Output (Returns Embeds)
```sh
pnpm cli --source-id catflix --tmdb-id 11527
```

Example output:
```js
{
  embeds: [
    {
      embedId: 'turbovid',
      url: 'https://turbovid.eu/embed/DjncbDBEmbLW'
    }
  ]
}
```
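
Each embed returned by a source scraper can be tested on its own by passing its ID and URL back to the CLI. The sketch below turns a source result into those follow-up commands. `SourceOutput` and `embedCommands` are hypothetical names, and the shape is inferred from the example output rather than taken from the library's exported types.

```typescript
// Shape inferred from the example CLI output; not the library's exported type.
interface EmbedRef {
  embedId: string;
  url: string;
}

interface SourceOutput {
  embeds: EmbedRef[];
}

// Hypothetical helper: builds one CLI command per embed in a source result,
// using the --source-id and --url flags.
function embedCommands(output: SourceOutput): string[] {
  return output.embeds.map(
    (embed) => `pnpm cli --source-id ${embed.embedId} --url "${embed.url}"`,
  );
}

const commands = embedCommands({
  embeds: [{ embedId: "turbovid", url: "https://turbovid.eu/embed/DjncbDBEmbLW" }],
});
console.log(commands.join("\n"));
```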

#### Embed Scraper Output (Returns Streams)
```sh
pnpm cli --source-id turbovid --url "https://turbovid.eu/embed/DjncbDBEmbLW"
```

Example output:
```js
{
  stream: [
    {
      type: 'hls',
      id: 'primary',
      playlist: 'https://proxy.fifthwit.net/m3u8-proxy?url=https%3A%2F%2Fqueenselti.pro%2Fwrofm%2Fuwu.m3u8&headers=%7B%22referer%22%3A%22https%3A%2F%2Fturbovid.eu%2F%22%2C%22origin%22%3A%22https%3A%2F%2Fturbovid.eu%22%7D',
      flags: [],
      captions: []
    }
  ]
}
```

**Notice the proxied URL**: The `createM3U8ProxyUrl()` function creates URLs like `https://proxy.fifthwit.net/m3u8-proxy?url=...&headers=...` to handle protected streams. Read more about this in [Advanced Concepts](/in-depth/advanced-concepts).
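
When debugging, it can help to see what a proxied URL actually wraps. The following sketch decodes the `url` and `headers` query parameters from the example above using the standard `URL` API. `decodeProxiedPlaylist` is a hypothetical helper, and the assumption that `headers` holds URL-encoded JSON is based only on the example URL shown here.

```typescript
interface DecodedProxyUrl {
  target: string;
  headers: Record<string, string>;
}

// Hypothetical debugging helper: extracts the original playlist URL and the
// forwarded headers from a proxied URL of the form ...?url=...&headers=...
function decodeProxiedPlaylist(proxied: string): DecodedProxyUrl {
  // searchParams.get() already percent-decodes the value.
  const params = new URL(proxied).searchParams;
  const target = params.get("url");
  if (target === null) {
    throw new Error("No url parameter found; this may not be a proxied playlist");
  }
  // Assumption: the headers parameter, when present, is a JSON object.
  const rawHeaders = params.get("headers");
  const headers =
    rawHeaders === null ? {} : (JSON.parse(rawHeaders) as Record<string, string>);
  return { target, headers };
}

const decoded = decodeProxiedPlaylist(
  "https://proxy.fifthwit.net/m3u8-proxy?url=https%3A%2F%2Fqueenselti.pro%2Fwrofm%2Fuwu.m3u8&headers=%7B%22referer%22%3A%22https%3A%2F%2Fturbovid.eu%2F%22%2C%22origin%22%3A%22https%3A%2F%2Fturbovid.eu%22%7D",
);
console.log(decoded.target);
console.log(decoded.headers);
```

For the example URL, `decoded.target` is `https://queenselti.pro/wrofm/uwu.m3u8`, and the headers contain the `referer` and `origin` values for `turbovid.eu`.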

#### Interactive Mode Flow
```sh
pnpm cli
```

```
✔ Select a fetcher mode · native
✔ Select a source · catflix
✔ TMDB ID · 11527
✔ Media type · movie
✓ Done!
{
  embeds: [
    {
      embedId: 'turbovid',
      url: 'https://turbovid.eu/embed/DjncbDBEmbLW'
    }
  ]
}
```

## Development Workflow

1. **Setup**: Create the `.env` file and install dependencies
2. **Research**: Study the target website's structure and player technology
3. **Code**: Build your scraper following the established patterns
4. **Register**: Add your scraper to `all.ts` with a unique rank
5. **Test**: Use the CLI to test with several different movies and TV shows
6. **Iterate**: Fix issues and improve reliability
7. **Submit**: Create a pull request with thorough testing documentation
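
For the testing step, it can be convenient to generate the CLI invocations for a fixed set of titles instead of typing them by hand. This is a minimal sketch under stated assumptions: `buildCliArgs` and `TestCase` are hypothetical, `my-scraper` is a placeholder scraper ID, and only flags shown earlier on this page are used. The TMDB IDs are the ones from the CLI examples.

```typescript
// Hypothetical description of one test title; season/episode only apply to TV shows.
interface TestCase {
  tmdbId: number;
  season?: number;
  episode?: number;
}

// Hypothetical helper: builds the argument list for one `pnpm cli` run.
function buildCliArgs(sourceId: string, fetcher: string, test: TestCase): string[] {
  const args = [
    "cli",
    "--fetcher", fetcher,
    "--source-id", sourceId,
    "--tmdb-id", String(test.tmdbId),
  ];
  // Add season and episode only when both are given (TV shows).
  if (test.season !== undefined && test.episode !== undefined) {
    args.push("--season", String(test.season), "--episode", String(test.episode));
  }
  return args;
}

const cases: TestCase[] = [
  { tmdbId: 11527 },                        // The Shining
  { tmdbId: 129 },                          // Spirited Away
  { tmdbId: 94605, season: 1, episode: 1 }, // Arcane S1E1
];

// "my-scraper" is a placeholder; replace it with the ID of the scraper under test.
for (const fetcher of ["native", "node-fetch"]) {
  for (const test of cases) {
    console.log(["pnpm", ...buildCliArgs("my-scraper", fetcher, test)].join(" "));
  }
}
```

The printed commands can be pasted into a shell or saved to a script, which makes it easier to rerun the same checks after each iteration.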

## Next Steps

Once your environment is set up:

1. Read [Provider System Overview](/in-depth/provider-system) to understand how scrapers work
2. Read [Building Scrapers](/in-depth/building-scrapers) for a detailed implementation guide
3. Check [Advanced Concepts](/in-depth/advanced-concepts) for error handling and best practices

::alert{type="info"}
Always test your scrapers with multiple different movies and TV shows to ensure reliability across different content types.
::