Web
Overview
Leena AI enables seamless crawling or scraping of data from public websites, allowing users to integrate relevant information directly into their knowledge base. This feature supports a variety of use cases, including:
- Support Website Integration: Easily integrate support resources from platforms like Microsoft, MS Teams, Outlook, and more.
- Company Website Integration: Leena AI can crawl its own website or similar company websites to enrich user interactions with additional information.
- Custom Public Website Integration: Flexibly integrate data from any public website, such as an insurance provider's site for company employees, based on specific requirements.
This functionality enhances the accessibility and relevance of information within the knowledge base.
Note: This functionality depends on the website's structure. Some websites may need custom configuration for data scraping. Contact your Leena admin for these cases.
Who Can Access Web Integration in KM
This feature is available only to KM admins. KM agents cannot crawl a website.
Implementation Steps
Creating Web Integrations
- Go to KM dashboard -> Settings Page -> Web Integrations.
- Under Web Integrations, you can either connect with an existing integration (already supported by Leena AI) or you can create a new web integration.
Existing Integrations
On the landing page, you can explore the global library of web integrations already supported by Leena AI. Examples include:
- https://support.microsoft.com/en-us
- https://learn.microsoft.com/en-us
- https://support.atlassian.com/
- https://help.zscaler.com/zia
- https://docs.aws.amazon.com/
Click “Get Started” to view the list of URLs crawled under the given domain. This opens the settings page of the selected web integration, which displays all URLs crawled so far.
Adding a New Integration
- To add a new integration, click on the “Add new integrations” button.
- Enter the required domain (ensure it is a valid URL) and click on “crawl url”.
- Here you can also see the global library of web integrations maintained by Leena AI under the “Existing integrations” section.
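The steps above ask for a valid URL. As a rough illustration of what that usually means, the sketch below checks a candidate value before you paste it into the form. It assumes "valid" means an absolute http or https URL with a host name; the function name `looks_like_valid_url` is hypothetical, and the checks Leena AI actually applies may differ.

```python
from urllib.parse import urlparse


def looks_like_valid_url(candidate: str) -> bool:
    """Return True if candidate is an absolute http(s) URL with a host.

    Assumption: this mirrors a typical "valid URL" check, not Leena AI's own.
    """
    parsed = urlparse(candidate.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    for value in ("https://support.atlassian.com/", "support.atlassian.com"):
        print(value, "->", looks_like_valid_url(value))
```

A value without a scheme, such as `support.atlassian.com`, fails this check, so including `https://` is the safer choice.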
Crawler Status
Once the crawling request is submitted, the crawler page will be created. Its status will be either:
- In-Progress: If the crawling process has started.
- Pending: If the crawling process is yet to be started. The pending state will automatically go into the “In-Progress” state once earlier crawling requests are processed.
Note: The crawler name, description, and logo will be automatically fetched for the given URL.
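One way to picture the two statuses is as a first-in, first-out queue. The toy model below is only a sketch: it assumes a single crawl runs at a time, which the product documentation does not state, and `CrawlStatus` and `CrawlQueue` are hypothetical names rather than part of any Leena AI API.

```python
from collections import deque
from enum import Enum
from typing import Optional


class CrawlStatus(Enum):
    PENDING = "Pending"
    IN_PROGRESS = "In-Progress"


class CrawlQueue:
    """Toy FIFO model: one request crawls at a time, the rest wait as Pending."""

    def __init__(self) -> None:
        self._waiting = deque()  # URLs whose crawl has not started yet
        self._active: Optional[str] = None  # URL currently being crawled

    def submit(self, url: str) -> CrawlStatus:
        """Start crawling immediately if idle, otherwise queue the request."""
        if self._active is None:
            self._active = url
            return CrawlStatus.IN_PROGRESS
        self._waiting.append(url)
        return CrawlStatus.PENDING

    def finish_active(self) -> None:
        """Mark the active crawl as processed and promote the next request."""
        self._active = self._waiting.popleft() if self._waiting else None

    def status(self, url: str) -> Optional[CrawlStatus]:
        """Return the status of url, or None if it is not queued or active."""
        if url == self._active:
            return CrawlStatus.IN_PROGRESS
        if url in self._waiting:
            return CrawlStatus.PENDING
        return None
```

In this model, a second request stays "Pending" until `finish_active()` is called for the first one, at which point it moves to "In-Progress" without any user action.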
Web Integrations Settings Page
Once the crawling process for a new integration is completed or the user explores the existing integration, they will land on the settings page of that integration. This settings page will contain the following information:
- Name, Description and Logo: Details of the web integration (automatically fetched during crawling and editable).
- List of crawled URLs: Users can click on the URLs and view the fetched HTML content from this webpage.
- Sync Frequency: Users can define the frequency at which this web integration will be refreshed. In the refresh, the number of URLs and the content of each web URL will be updated.
- Create KM Articles: Option to create KM Articles from this web integration.
Creating Knowledge Articles from Web Integration
Once the crawling process is completed for a new web integration (or existing sample integrations), users have to create knowledge articles to integrate this knowledge with the virtual assistant.
- Initiate Creation: Click on the “Create article” button.
- Apply Filters (Optional): Users can apply filters to include or exclude selected URLs for adding to the knowledge article.
  - Default behavior: All URLs are included.
  - Note: Users can create multiple knowledge articles from one web integration based on different filters.
- Article Details: After applying filters (if required), move to the next section, fill in the required article details, and create the article.
  - Note: The Owner specified in this step will become the article owner in the KM.
- Draft State: Once created, a knowledge article is generated in the “Draft” state in KM. It will be visible on the “Home” page and the “Articles” page. This article contains the list of selected URLs.
- Review and Publish: The article must be reviewed and published for its knowledge to be available in the virtual assistant.
- Sync and Usage: Once published, the article syncs with the virtual assistant. After the sync is completed, users can go to the virtual assistant and start asking questions about the webpage content.
Note: The “View Source” button in the bot will take the user to the original webpage in the browser.
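To make the include/exclude filters from the steps above more concrete, here is a minimal sketch of how such filtering behaves. It assumes glob-style patterns, which is an illustration only, since the filter syntax in the KM dashboard is not specified here. The function `select_urls` and the `insurer.example.com` URLs are hypothetical.

```python
from fnmatch import fnmatchcase


def select_urls(urls, include=None, exclude=None):
    """Keep URLs that match any include pattern and no exclude pattern.

    With no include patterns, every URL is included (the default behavior).
    Patterns are glob-style, e.g. "https://site.example.com/claims/*".
    """
    include = include or ["*"]
    exclude = exclude or []
    return [
        url
        for url in urls
        if any(fnmatchcase(url, pattern) for pattern in include)
        and not any(fnmatchcase(url, pattern) for pattern in exclude)
    ]


if __name__ == "__main__":
    crawled = [
        "https://insurer.example.com/claims/how-to-file",
        "https://insurer.example.com/claims/status",
        "https://insurer.example.com/careers/openings",
    ]
    # Two different filters over the same integration give two article scopes.
    print(select_urls(crawled, include=["https://insurer.example.com/claims/*"]))
    print(select_urls(crawled, exclude=["*/careers/*"]))
```

Running different filters over the same crawled list corresponds to creating multiple knowledge articles from one web integration.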
FAQs on Web Integrations
Q: Can a user integrate with authenticated websites? A: Currently, only public knowledge websites are supported for web integrations in the UI. Authenticated websites require custom configuration, and only some of them can be integrated. Contact your Leena admin to integrate an authenticated website.
Q: What is the primary function of the Web Integration feature? A: It allows Leena AI to crawl and scrape data from public websites (like support sites or company websites) and integrate that information directly into the Knowledge Management (KM) base.
Q: Who has permission to add new websites for crawling? A: This feature is available only to KM admins. KM agents cannot set up web integrations.
Q: Is there a limit to how many pages can be crawled from one website? A: Yes. The crawler supports a maximum of 10,000 pages per domain.
Q: Does Leena AI already have pre-configured integrations? A: Yes, Leena AI maintains a "global library" of existing integrations, including support.microsoft.com, support.atlassian.com, docs.aws.amazon.com, and others.
Q: How do I add a new website for crawling? A:
- Go to KM Dashboard -> Settings -> Web Integrations.
- Click the "Add new integrations" button.
- Enter the valid domain URL and click "crawl url".
Q: What happens immediately after I submit a URL for crawling? A: A new crawler page is created with a status of "Pending" or "In-Progress." The system will automatically fetch the website's name, description, and logo.
Q: How will I know when the crawling is complete? A: You will be notified via dashboard and email notifications when the process starts, completes, or if an error occurs.
Q: After the crawler is finished, is the information live in the virtual assistant? A: No, not immediately. You must first create knowledge articles from the crawled content.
Q: How do I create a knowledge article from the crawled content? A:
- From the integration's settings page, click "Create article".
- (Optional) Apply filters to include or exclude specific URLs.
- Fill in the article details (like the owner).
- The article is then created in a "draft" state. You must manually review and publish this article from the KM "Home" or "Articles" page to make it available to the virtual assistant.
Disclaimer: Crawling Page Limit
Please be advised that the web integration crawler is limited to a maximum of 10,000 pages per domain.
