Scraping

“Scraping” literally means scraping something off or scraping it together. In cybersecurity, the term refers to collecting (scraping together) and storing data, usually from websites, platforms or social networks.

What does the term scraping mean in detail?

Scraping, the collection and storage of data, can be done in two ways:

  • Manually, i.e., by hand. With larger amounts of data, this approach becomes very labor-intensive.
  • Automatically, for example by computer programs. This allows even large amounts of data to be processed quickly.
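
To illustrate the automatic approach, here is a minimal Python sketch that collects links from a piece of HTML using only the standard library. The sample markup, the class name `LinkCollector` and the function `scrape_links` are assumptions made for illustration; a real scraper would first download the page and would typically handle far messier markup.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (link text, href) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []     # collected (text, href) pairs
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical page content; a real scraper would download this first.
SAMPLE_HTML = """
<ul>
  <li><a href="/products/1">Widget A</a> 19.99 EUR</li>
  <li><a href="/products/2">Widget B</a> 24.50 EUR</li>
</ul>
"""


def scrape_links(html):
    """Parse the HTML and return all links found in it."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    for text, href in scrape_links(SAMPLE_HTML):
        print(text, href)
```

The same loop works unchanged whether the input contains two links or two thousand, which is why automated scraping scales so much better than copying data by hand.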

Today, the term scraping is mainly used for collecting data from websites. In principle, however, it can refer to any text displayed on a screen. Depending on the context, more specific terms are therefore used, such as web scraping, screen scraping or data scraping. What they all have in common is the collection and storage of data.

Scraping can be done for a variety of purposes:

  • for your own analyses, for example a manual competitor analysis.
  • for automatically collecting and preparing data from many different websites.
  • for collecting contact data, for example email addresses published on social media platforms.
  • for copying content from third-party websites and publishing it without authorization.

Where do I encounter scraping in my day-to-day work?

Scraping is behind every search engine query and every online price comparison. Search engine programs continuously “scrape” the addresses and content of web pages so that they can display them as search results. Price comparison services use scraping to collect prices, images and, where applicable, product details.

Scraping is also frequently used in a professional context, for example for competitor analysis.

However, you may also encounter the abusive side of scraping in your everyday work, for example in the form of:

  • a phishing email sent to you after your email address was scraped from, for example, the company website or LinkedIn.
  • a company that scrapes your prices and systematically undercuts them.
  • a company that has copied text and images from your website without your consent.
  • phishing websites that use scraping to copy legitimate pages in detail, for example a login page for online banking.

What can I do to protect myself from abusive scraping?

  • Be deliberate about which data you share on websites and social media. Anything you publish can be collected, stored, and shared via scraping.
  • Minimize the amount of data you publish that can be abused by scrapers. For example, set up contact forms on your company website instead of listing email addresses.
  • Follow the guidance in this Perseus blog post to check whether data about you or your company has already been collected and published via scraping.
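
As a rough self-check for the second point, the following Python sketch scans the text of one of your own pages for email addresses that a scraper could pick up just as easily. The sample page, the name `find_exposed_emails` and the deliberately simple pattern are assumptions for illustration; the pattern does not cover every valid address format.

```python
import re

# Deliberately simple pattern; it does not cover every valid address format.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_exposed_emails(page_text):
    """Return the unique email addresses found in the text, sorted."""
    return sorted(set(EMAIL_PATTERN.findall(page_text)))


# Hypothetical excerpt from a company website.
SAMPLE_PAGE = """
<p>Contact: <a href="mailto:jane.doe@example.com">jane.doe@example.com</a></p>
<p>Sales: sales@example.com</p>
<p>Or use our <a href="/contact">contact form</a>.</p>
"""

if __name__ == "__main__":
    for address in find_exposed_emails(SAMPLE_PAGE):
        print(address)
```

Every address this check reports is one that an automated scraper could harvest in the same way, so each is a candidate for replacement by a contact form.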
