
Docop configuration

Docop reads its configuration from the config.yaml file. To use a different file, pass it with the docop --config option.

Directories

The dirs section specifies where docop looks for pipelines (.yaml files), Python task modules (.py files), and documents created by tasks.

Directories are resolved relative to the directory from which docop is run.

dirs:
  pipes: pipes
  tasks: tasks
  content: content
  configs: configs
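
As a rough illustration of the relative-path rule, the sketch below resolves each configured directory against a base directory, defaulting to the current working directory. The helper name resolve_dirs and the plain dictionary standing in for the parsed dirs section are assumptions made for this example; they are not part of docop.

```python
from pathlib import Path

# Stand-in for the parsed `dirs` section of config.yaml (values copied from the example).
dirs = {"pipes": "pipes", "tasks": "tasks", "content": "content", "configs": "configs"}


def resolve_dirs(dirs, base=None):
    """Hypothetical helper: join each configured directory onto `base`.

    `base` defaults to the current working directory, mirroring the rule
    that directories are relative to where docop is run.
    """
    base = Path(base) if base is not None else Path.cwd()
    return {name: base / rel for name, rel in dirs.items()}


if __name__ == "__main__":
    for name, path in resolve_dirs(dirs).items():
        print(f"{name}: {path}")
```

Running the script from a different directory changes every resolved path, which is the practical consequence of keeping the configured values relative.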

Sources to fetch

sources:
  wikipedia:
    resources:
      - https://en.wikipedia.org/wiki/Document_automation

  eff:
    title: Electronic Frontier Foundation
    resources:
      - https://www.eff.org/deeplinks/2023/11/debunking-myth-anonymous-data
      - https://www.eff.org/deeplinks/2023/11/publics-right-fight-bad-patents-must-be-protected

  foei:
    title: Friends of the Earth
    resources:
      - https://www.foei.org/press-release-nature-of-business-report/
      - https://www.foei.org/reaction-ipcc-synthesis/
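
To show how a task might walk this structure, here is a minimal sketch that flattens the sources section into (source name, title, URL) rows. The function iter_resources is hypothetical, and falling back to the source name when no title is given (as for wikipedia above) is an assumption of this example, not documented docop behavior.

```python
# Stand-in for the parsed `sources` section (values copied from the example).
sources = {
    "wikipedia": {
        "resources": ["https://en.wikipedia.org/wiki/Document_automation"],
    },
    "eff": {
        "title": "Electronic Frontier Foundation",
        "resources": [
            "https://www.eff.org/deeplinks/2023/11/debunking-myth-anonymous-data",
            "https://www.eff.org/deeplinks/2023/11/publics-right-fight-bad-patents-must-be-protected",
        ],
    },
    "foei": {
        "title": "Friends of the Earth",
        "resources": [
            "https://www.foei.org/press-release-nature-of-business-report/",
            "https://www.foei.org/reaction-ipcc-synthesis/",
        ],
    },
}


def iter_resources(sources):
    """Hypothetical helper: yield (source name, title, url) for every resource."""
    for name, spec in sources.items():
        # Assumption: use the source name when no title is configured.
        title = spec.get("title", name)
        for url in spec.get("resources", []):
            yield name, title, url


if __name__ == "__main__":
    for name, title, url in iter_resources(sources):
        print(f"[{name}] {title}: {url}")
```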

Export targets

Docop requires a targets section that names the export targets, but it imposes no other conventions on target information. Any additional content can be included, such as data used for authentication.

targets:
  mysite:
    title: my target site
    domain: mysite
    account: mysystem

Authentication

Retrieving, processing, or exporting content often requires authenticating with a third-party system.

Accounts holding arbitrary information can be defined in the accounts section of a configuration file:

accounts:
  mysystem:
    title: my system account
    apikey: none

The account to use can then be set for each source or target; see the export targets example above. For a general processing task, an account can be given with the --account command-line option. The same option can also override an account specified in the configuration.
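
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: resolve_account is a hypothetical helper, the override argument stands in for a value passed via --account, and the dictionary mirrors the targets and accounts examples rather than anything docop exposes.

```python
# Stand-in for the parsed configuration (values copied from the examples).
config = {
    "targets": {
        "mysite": {"title": "my target site", "domain": "mysite", "account": "mysystem"},
    },
    "accounts": {
        "mysystem": {"title": "my system account", "apikey": "none"},
    },
}


def resolve_account(config, target_name, override=None):
    """Hypothetical helper: return the account data that applies to a target.

    `override` plays the role of a command-line account name and takes
    precedence over the account named in the target itself.
    """
    target = config["targets"][target_name]
    account_name = override or target.get("account")
    if account_name is None:
        return None  # the target names no account and none was given
    # An unknown account name raises KeyError rather than failing silently.
    return config["accounts"][account_name]


if __name__ == "__main__":
    print(resolve_account(config, "mysite"))
```

Raising on an unknown account name is a design choice for this sketch: a misspelled name then surfaces immediately instead of leading to an unauthenticated request later.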