Building a company data enrichment API

Okay, let's cut to the chase - in the dynamic realm of B2B sales, having precise and actionable company data at your fingertips isn't just a convenience, it's a substantial edge. Today, I am going to unpack the nuances of my latest project, ProspectLens.net, a bootstrapped SaaS designed to simplify your B2B sales strategies through an effective company data enrichment API. In this post, I will highlight how ProspectLens can be a potent tool in your inbound and outbound sales toolkit.

Why? Building an affordable Crunchbase API alternative

The journey began when several fellow B2B SaaS founders expressed a common desire: to seamlessly enrich their signup processes with pertinent company data. They were frustrated with the inaccessibility of the Crunchbase API, which seemed only available to Fortune 500 behemoths with deep pockets.

Recognizing this gap, I saw an opportunity to swiftly offer significant value using the powerful ScrapeNinja web scraping engine as a backbone. Imagine a prospect initiating an inbound sales process by creating an account on a SaaS CRM with just a business email. This is where the ProspectLens company enrichment API steps in, converting a simple email into a rich source of essential business insights. Just call the ProspectLens API like this: /lookup?domain=example.com and get a JSON with a wealth of comprehensive information about the corresponding company, with aggregated data commonly found on platforms such as Crunchbase and Semrush, including funding data, website traffic analysis, founder profiles, social media links, and more.
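To make the email-to-lookup step concrete, here is a minimal Python sketch of how a signup email could be turned into a /lookup request URL. The base URL below is a placeholder, not the real host (the actual endpoint and API key come from the marketplace subscription), and the helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Placeholder host (assumption): the real base URL is shown on the marketplace page.
BASE_URL = "https://prospectlens.example/api"


def domain_from_email(email: str) -> str:
    """Return the lowercased domain part of an email address."""
    return email.rsplit("@", 1)[1].strip().lower()


def build_lookup_url(domain: str, base_url: str = BASE_URL) -> str:
    """Build the GET URL for the /lookup endpoint with a URL-encoded domain."""
    return f"{base_url}/lookup?{urlencode({'domain': domain})}"


lookup_url = build_lookup_url(domain_from_email("Jane.Doe@Example.com"))
print(lookup_url)
```

The request itself is left out on purpose, because the authentication header depends on which marketplace you subscribed through.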

My first customer

My first client was a SaaS owner in the cybersecurity niche who offers his product to larger companies, such as financial organisations. He wanted to flag the best contacts in his user base and personalize his outreach to these existing customers. After we completed a test run and ironed out a few issues with the ProspectLens process, the success rate for his contact base was close to 65%, which was an acceptable result. Here, "success" means that the domain name from the contact email address was found and correctly matched by the ProspectLens /lookup endpoint, so the proper JSON with the corresponding company data was aggregated and returned. But it got me thinking: what if there is no company data on Crunchbase?

Enriching data with LinkedIn

I realized that LinkedIn is another great source of business data: I can grab the follower count and employee count from a company's LinkedIn page, and there are many more companies on LinkedIn than on Crunchbase. It was important not to scrape personal profiles from LinkedIn, to avoid potential GDPR issues.

I also started to scrape metadata from the company domain itself, to see if it redirects anywhere and if there is any website there.

As a result, /lookup-all was born as a more advanced endpoint with a much higher coverage percentage (I would say a 90% success rate on average). It launches concurrent requests that research the company by domain name and merges the results from multiple web sources, and it is resilient to failures.
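The concurrent-and-merge idea can be sketched in a few lines of Python with asyncio. This is not the actual ProspectLens implementation, just an illustration of the pattern: the fetchers below are stand-ins that return canned data, and the function and source names are assumptions.

```python
import asyncio
from typing import Awaitable, Callable

Fetcher = Callable[[str], Awaitable[dict]]


async def lookup_all(domain: str, sources: dict[str, Fetcher]) -> dict:
    """Query every source concurrently and merge whatever succeeded."""
    names = list(sources)
    results = await asyncio.gather(
        *(sources[name](domain) for name in names),
        return_exceptions=True,  # one failing source must not cancel the others
    )
    merged = {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            continue  # skip the failed source, keep the rest
        merged[name] = result
    return {"data": merged}


# Stand-in fetchers (hypothetical): real ones would perform HTTP requests.
async def fake_cb(domain: str) -> dict:
    raise TimeoutError("source timed out")


async def fake_linkedin(domain: str) -> dict:
    return {"employeeNum": 334}


async def fake_metadata(domain: str) -> dict:
    return {"finalUrl": f"https://www.{domain}/"}


merged_result = asyncio.run(
    lookup_all(
        "make.com",
        {"cb": fake_cb, "linkedin": fake_linkedin, "metadata": fake_metadata},
    )
)
print(merged_result)
```

With return_exceptions=True, a timeout in one source shows up as an exception object in the results list instead of aborting the whole lookup, which is what keeps the merged response useful when a single source is down.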

API-First: designed to be integrated

1. CRM integration

At its core, ProspectLens takes an API-first approach: it is designed to integrate seamlessly with your CRM of choice, and HubSpot and Pipedrive are great candidates. It's not just about retrieving data, it's about making that data work for you in the simplest way possible, enhancing your CRM's capabilities and personalizing your inbound and outbound sales strategies so they work best for each particular prospect.

2. Demo dashboard

As the target audience (sales teams) may not be extremely technical and might prefer to see a demo of the product before investing resources into the integration process, I have implemented a demo dashboard that features a multi-threaded API launcher where you can input a list of domain names (typically extracted from business email addresses) and get a CSV with the corresponding company data back.

3. Scoring and qualifying prospects, in Google Sheets

I love Google Sheets. It's a perfect database for quick demos. I also love Make.com, which is a great Zapier alternative for automating things. Both are versatile and powerful tools, and together they make a perfect duo. So, for some ProspectLens customers, I have also implemented a quick Make.com no-code pipeline. It is useful if you want to populate a Google Sheet that already contains an email list with company fields like company name, funding round totals, location, and LinkedIn URL:

The scenario is really straightforward: it iterates through the Google Sheet column containing domain names (which were extracted from another column with business emails by the Google Sheets formula =MID(A2,FIND("@",A2)+1,LEN(A2)-FIND("@",A2))) and calls the ProspectLens /lookup API endpoint to enrich each domain with company data. The scenario also filters out @gmail and @outlook domains to avoid useless calls to the API.
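If you prefer code to a no-code scenario, the same preprocessing can be sketched in Python. The set of free-mail domains is an assumption (the scenario above only names gmail and outlook), and the de-duplication step is an extra convenience that is not part of the Make.com scenario.

```python
# Assumed list of free-mail domains; extend it to match your own contact base.
PUBLIC_EMAIL_DOMAINS = {"gmail.com", "outlook.com"}


def extract_domain(email: str) -> str:
    """Same idea as the MID/FIND formula: everything after the first "@"."""
    return email[email.find("@") + 1:].strip().lower()


def domains_to_enrich(emails: list[str]) -> list[str]:
    """Return unique business domains in first-seen order, skipping free-mail ones."""
    seen = set()
    result = []
    for email in emails:
        if "@" not in email:
            continue  # not an email address, nothing to extract
        domain = extract_domain(email)
        if domain in PUBLIC_EMAIL_DOMAINS or domain in seen:
            continue
        seen.add(domain)
        result.append(domain)
    return result


emails = [
    "bob@ibm.co.uk",
    "alice@gmail.com",
    "carol@make.com",
    "dave@IBM.co.uk",
    "eve@outlook.com",
]
business_domains = domains_to_enrich(emails)
print(business_domains)  # ['ibm.co.uk', 'make.com']
```

Skipping duplicates before calling the API means each company is looked up once, no matter how many contacts share its domain.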

I will be happy to share the Make.com scenario, just drop me a message: contact@prospectlens.net

Getting Started

So, essentially, there are 3 major ways to start using ProspectLens:

  1. Integrate it into the CRM of your choice: HubSpot, Pipedrive, or a custom-built one
  2. Use Google Sheets and the Make.com scenario to perform API calls
  3. Use the API runner GitHub repo (you still need to know what Docker is to launch it)

Which one to use? My recommendation is to start with #2 (Google Sheets) and do some test runs on 50-100 emails to understand the yield. ProspectLens is not perfect: it can't enrich a lead whose email uses an @gmail.com address. For business emails, the success rate is usually around 50-70%, depending on the business vertical.

To start using ProspectLens, you need to subscribe to the API on the APIRoad marketplace: https://apiroad.net/marketplace/apis/prospectlens - and retrieve an API key. It is also available on the RapidAPI marketplace: https://rapidapi.com/restyler/api/website-intelligence

It might look a bit unusual, but the ProspectLens website does not have its own signup form. As it is an API product, all the subscription, rate limiting, and account management is offloaded to an external marketplace, which also gives you API usage analytics and documentation.

Copy and paste your ProspectLens API key and API endpoint into Make.com or the API runner, and try it out!

API endpoints overview

ProspectLens stands as a simple, subscription-based tool, designed to retrieve crucial company data through website domain names without any fuss. Let's delve deeper into what makes ProspectLens a valuable addition to your toolkit:

The /lookup endpoint: Data retrieval made easy

The /lookup GET endpoint serves as your primary resource for obtaining a wealth of data akin to what you would find on platforms like Crunchbase. Here are some examples of the data that can be retrieved, providing a clearer lens to view potential prospects:

Company Description

  • Get .data.properties.short_description
  • Example value: "SaaS company involved in fintech"
  • Get .data.properties.title
  • Example value: "Microsoft"

Funding Rounds Summary Total

  • Get .data.info.funding_rounds_summary.funding_total.value_usd (integer)
  • Example value: 1000000 - which means $1M total funding.

Website Traffic Estimate (via SEMrush)

  • Get .data.info.semrush_summary.semrush_visits_latest_month
  • Example value: 341427 - indicating traffic of 341k visits per month.

LinkedIn

  • Get .data.info.social_fields.linkedin.value (Optional)
  • Example value: "http://www.linkedin.com/company/microsoft"

Twitter

  • Get .data.info.social_fields.twitter.value (Optional)
  • Example value: "http://twitter.com/@Microsoft"
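Since some of these fields are optional, it helps to read them defensively. Below is a small Python sketch that walks the dotted paths listed above and returns None for anything missing. The dig and summarize helpers are made up for this example, and the sample response is hand-assembled from the example values in this section, not a real API response.

```python
from typing import Any


def dig(obj: Any, path: str, default: Any = None) -> Any:
    """Follow a dotted path such as ".data.info.x" through nested dicts."""
    current = obj
    for key in path.strip(".").split("."):
        if not isinstance(current, dict) or key not in current:
            return default  # missing or optional field
        current = current[key]
    return current


def summarize(response: dict) -> dict:
    """Pick the fields described above out of a /lookup response."""
    return {
        "title": dig(response, ".data.properties.title"),
        "description": dig(response, ".data.properties.short_description"),
        "funding_usd": dig(
            response, ".data.info.funding_rounds_summary.funding_total.value_usd"
        ),
        "monthly_visits": dig(
            response, ".data.info.semrush_summary.semrush_visits_latest_month"
        ),
        "linkedin": dig(response, ".data.info.social_fields.linkedin.value"),
        "twitter": dig(response, ".data.info.social_fields.twitter.value"),
    }


# Hand-made sample built from the example values above; funding and Twitter are absent.
sample_lookup = {
    "data": {
        "properties": {"title": "Microsoft"},
        "info": {
            "semrush_summary": {"semrush_visits_latest_month": 341427},
            "social_fields": {
                "linkedin": {"value": "http://www.linkedin.com/company/microsoft"}
            },
        },
    }
}
summary = summarize(sample_lookup)
print(summary)
```

Returning None instead of raising keeps a CRM sync or spreadsheet export running when a company has no funding data or no Twitter account on record.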

Smart company name matching by relevancy

An important feature of ProspectLens is that it does not require an exact match of the domain name. For example, if your prospect has signed up as bob@ibm.co.uk, there is a high chance that ibm.com is the parent entity, and this is what /lookup will try to return.

The Backup Strategy: /lookup-google

There might be occasions where the /lookup endpoint falls short. To counter this, the /lookup-google GET endpoint acts as a backup: it retrieves SERP results for the website domain name, so you still have something to work with when no company profile is matched.
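On the client side, the backup strategy is a simple try-then-fall-back pattern. Here is a hedged Python sketch. It assumes your own wrapper around /lookup returns None when nothing is matched, which is a convention of this example and not something the API defines. The two stand-in functions replace real HTTP calls.

```python
from typing import Callable, Optional

Lookup = Callable[[str], Optional[dict]]


def lookup_with_fallback(domain: str, primary: Lookup, backup: Lookup) -> dict:
    """Try the primary lookup first; use the backup only when it finds nothing."""
    result = primary(domain)
    if result:
        return {"source": "lookup", "result": result}
    return {"source": "lookup-google", "result": backup(domain)}


# Stand-ins (hypothetical): real versions would call /lookup and /lookup-google.
def fake_lookup(domain: str) -> Optional[dict]:
    known = {"make.com": {"title": "Make"}}
    return known.get(domain)


def fake_lookup_google(domain: str) -> Optional[dict]:
    return {"serp": [f"Search results for {domain}"]}


hit = lookup_with_fallback("make.com", fake_lookup, fake_lookup_google)
miss = lookup_with_fallback("tiny-startup.example", fake_lookup, fake_lookup_google)
print(hit["source"], miss["source"])  # lookup lookup-google
```

Tagging each result with its source makes it easy to see later how much of your list was enriched from a company profile and how much only from search results.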

The /lookup-all endpoint: Merging multiple sources

The /lookup-all endpoint merges several data sources into one big JSON with a wealth of company data.

Here is a sample (abridged) JSON response of the /lookup-all endpoint, containing data about the company Make.com:

{
  "data": {
    "cb": {
      "properties": {
        "identifier": {
          "uuid": "2fb09001-17f5-48a6-9544-0c9db01a65ee",
          "value": "Make",
          "image_id": "q5ifhvzufwm61x4wkrth",
          "permalink": "make-65ee",
          "entity_def_id": "organization"
        },
        "title": "Make",
        "short_description": "Design, build and automate anything at the speed of your ideas."
      },
      "info": {
        "semrush_summary": {
          "semrush_global_rank": 30316,
          "semrush_visits_latest_month": 2726878
        },
        "current_advisors_image_list":
        ....
    },
    "linkedin": {
      "type": "organization",
      "metaDescription": "Make | 29,618 followers on LinkedIn. Design, build and automate anything at the speed of your ideas. | Our vision is a world where everyone has the power to innovate without limits. Make is the leading visual platform for anyone to design, build, and automate anything - from tasks and workflows to apps and systems - without coding. Make enables individuals, teams, and enterprises across all verticals to create powerful custom solutions that scale their businesses faster than ever.",
      "followerNum": 29618,
      "employeeNum": 334,
      "logoUrl": "https://media.licdn.com/dms/image/C4E0BAQG-Ky4v1uZOPQ/company-logo_200_200/0/1645456009981?e=2147483647&v=beta&t=f-YcrbZSxISke1WsEZ49Urzv0pJctmu6lWCl-al-KBY",
      "name": "Make",
      "url": "https://cz.linkedin.com/company/itsmakehq",
      "address": {
        "type": "PostalAddress",
        "streetAddress": "Menclova 2538/2",
        "addressLocality": "Prague",
        "addressRegion": "Praha 8",
        "postalCode": "18000",
        "addressCountry": "CZ"
      },
      "description": "Our vision is a world where everyone has the power to innovate without limits. Make is the leading visual platform for anyone to design, build, and automate anything - from tasks and workflows to apps and systems - without coding. Make enables individuals, teams, and enterprises across all verticals to create powerful custom solutions that scale their businesses faster than ever. Make powers over 500,000+ organizations around the globe. ",
      "slogan": "Design, build and automate anything at the speed of your ideas.",
      "sameAs": "https://make.com"
    },
    "metadata": {
      "info": {
        "version": "2",
        "statusCode": 200,
        "statusMessage": "",
        "finalUrl": "https://www.make.com/en",
        "publicEmailDomain": false
      },
      "extracted": {
        "meta": {
          "title": "Make | Automation Software | Connect Apps & Design Workflows",
          "description": "Automate your work. Make allows you to visually create, build and automate workflows. User friendly no-code integration tool. Try it now for free!",
          "favicon": "https://www.make.com/en/en/favicon-32x32.png",
          "ogImage": "https://images.ctfassets.net/qqlj6g4ee76j/3JMOVllWFLEYAExIuNexJ5/0c5e245d7212236b5ce245e03cc5bdc6/OG-Image_.png"
        }
      }
    }
  }
}

So, the .data property contains 3 sources: .data.cb, .data.linkedin, and .data.metadata.
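Because any of the three sources can be missing for a given company, client code should not assume they are all present. Here is a minimal Python sketch that flattens a /lookup-all response into a few fields. The company_snapshot helper is made up for this example, the sample is trimmed from the Make response above with the cb block removed on purpose, and the assumption that a missing source is simply absent or null is mine.

```python
def company_snapshot(response: dict) -> dict:
    """Flatten a /lookup-all response into a few fields, tolerating missing sources."""
    data = response.get("data") or {}
    cb = data.get("cb") or {}
    linkedin = data.get("linkedin") or {}
    metadata = data.get("metadata") or {}
    semrush = (cb.get("info") or {}).get("semrush_summary") or {}
    return {
        # Prefer the cb title, fall back to the LinkedIn name.
        "name": (cb.get("properties") or {}).get("title") or linkedin.get("name"),
        "monthly_visits": semrush.get("semrush_visits_latest_month"),
        "employees": linkedin.get("employeeNum"),
        "final_url": (metadata.get("info") or {}).get("finalUrl"),
        "sources": [key for key in ("cb", "linkedin", "metadata") if data.get(key)],
    }


# Trimmed from the sample above, with "cb" left out to simulate a missing source.
sample_lookup_all = {
    "data": {
        "linkedin": {"name": "Make", "followerNum": 29618, "employeeNum": 334},
        "metadata": {
            "info": {"statusCode": 200, "finalUrl": "https://www.make.com/en"}
        },
    }
}
snapshot = company_snapshot(sample_lookup_all)
print(snapshot)
```

The sources list records which blocks were actually present, which is handy when you score prospects and want to weigh a LinkedIn-only match differently from a full profile.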

Backed by ScrapeNinja.net technology

Let's dive deeper into the engine propelling ProspectLens - ScrapeNinja.net. I started ScrapeNinja in 2021 in an attempt to simplify web scraping tasks for myself and fellow software engineers, and it gained traction: it received a Product Hunt award in 2023 and is now used by hundreds of customers for data extraction pipelines. ScrapeNinja technology acts as the backbone, managing data retrieval and cleanup from a myriad of sources and delivering neat, organized, and reliable data.

Ever since I ventured into the field of web scraping, I always wanted to create something that could make the benefits of data extraction accessible to everyone, not just those well-versed in the technicalities of the domain. That's where the idea for ProspectLens blossomed from. I envisioned it as a bridge, connecting individuals like sales folks, SaaS entrepreneurs, and heads of growth, who may not have deep insights into HTTP protocols or APIs, to the sophisticated world of web scraping.

Through ProspectLens, I've tried to simplify the otherwise complex process of data extraction, making it almost effortless for a wide range of professionals to tap into ScrapeNinja's innovative technology. It's not just about fostering growth; it's about empowering people to make informed decisions by harnessing the power of data, without getting bogged down by the technical complexities. It's my earnest attempt to bring a piece of the technological marvel to everyone's desk, making data-driven strategies not a privilege but a norm in various business sectors.

Read more about underlying web scraping technologies of ScrapeNinja

Conclusion: the tool to enhance your B2B sales strategies with enriched leads data

To wrap this up, ProspectLens stands as a practical and effective ally in reinforcing your B2B sales strategy. Whether you're focusing on inbound or outbound sales, the depth of data available through ProspectLens can be a strategic advantage, making your sales strategies smarter and more informed, without adding complexity to the process.

Please try out ProspectLens and let me know if it turns out to be useful for you (or not). We are actively conducting customer interviews and working on adding more data sources to enhance the API's usefulness even further.