What is the crawl depth of the website crawler?
In practice, the crawl depth is effectively unlimited. Within a single crawl session, the crawler follows links up to 4 levels deep from the starting URL chosen in the source configuration. However, every second crawl session starts instead from a randomly chosen page that was already created during a previous crawl session. Over successive sessions, the Cikisi crawler therefore reaches progressively deeper into the site.
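To see why a per-session limit of 4 still leads to unlimited depth over time, here is a minimal Python sketch on a toy link graph. It is not the Cikisi implementation: the function names, the breadth-first strategy, the page names, and the exact alternation rule (odd-numbered sessions restart from a random known page) are all assumptions made for illustration.

```python
import random
from collections import deque

MAX_DEPTH = 4  # per-session depth limit described above


def crawl_session(links, start, max_depth=MAX_DEPTH):
    """Breadth-first crawl of a toy link graph, at most max_depth hops from start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen


def pick_start(session_index, configured_start, known_pages, rng):
    """Assumed rule: every second session restarts from a random known page."""
    if session_index % 2 == 1 and known_pages:
        return rng.choice(sorted(known_pages))
    return configured_start


# Hypothetical site: a chain of 12 pages, p0 -> p1 -> ... -> p11.
links = {f"p{i}": [f"p{i + 1}"] for i in range(11)}
rng = random.Random(0)
known = set()
for session in range(6):
    start = pick_start(session, "p0", known, rng)
    known |= crawl_session(links, start)
```

A single session from p0 only reaches p0 to p4, but because later sessions can restart from any page already collected, the set of known pages can grow past the fourth level without any change to the per-session limit.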
Can I choose the depth of the crawl?
No. Instead of a depth setting, we have opted for a more precise restriction, based on the structure that the URL of an article must have in order to be collected. For example, you can ask to collect only articles with /en/news in their URL.
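As a rough illustration of this kind of URL-based rule, the Python sketch below keeps only URLs whose path contains /en/news. The function name, the example.com URLs, and the choice to test the path rather than the full URL are assumptions; the actual rule syntax in the Cikisi source configuration is not shown here.

```python
from urllib.parse import urlparse


def matches_url_rule(url, required_path="/en/news"):
    """Return True when the URL's path contains the required fragment.

    Hypothetical helper: only the path is checked, so the fragment
    appearing in a query string does not count as a match.
    """
    return required_path in urlparse(url).path


candidates = [
    "https://example.com/en/news/2024/product-launch",
    "https://example.com/fr/actualites/lancement",
    "https://example.com/en/careers",
]
articles_to_collect = [u for u in candidates if matches_url_rule(u)]
```

With these sample URLs, only the first one would be kept, whatever its depth in the site.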