Frontlines DX using images and 3D data

AI technology that combines multiple types of data processing to address the complex issues faced by frontline workers in diverse industries

Background

In recent years, there has been an urgent need to improve the productivity of workers on the frontlines of the construction, logistics, retail, and medical industries. AI technologies such as generative AI are advancing rapidly, and there are great expectations for digital transformation (DX) using these technologies.

The data required to solve frontline problems comes in various formats, such as numerical data from sensors, image data, three-dimensional data, and text data representing bibliographic information. Therefore, these data must be combined appropriately and handled comprehensively.

However, despite the need to handle different forms of data, frontline workers may not have the knowledge or skills to manage data processing technology or AI technology. This makes it difficult for them to come up with appropriate solutions to their own problems.

Solutions

Ricoh digitizes the frontlines by applying the optical and image processing technologies it has cultivated over many years to acquire the various types of data needed to solve the issues that arise there. In addition, AI personnel who have the knowledge to analyze and process these data comprehensively work closely with customers to derive solutions.

Ricoh has a track record of developing AI technologies that apply image, 3D, and natural language processing for customers in the construction industry and other fields. Ricoh aims to use these technologies to develop AI for customers in a wider range of industries.

This is an image showing an example of image AI utilization to improve the efficiency of construction industry work.

Figure 1. Development results for customers in the construction industry

Technical Highlights

Digitization of frontline conditions using spherical images (360-degree images) and AI technology

The optical and image processing technologies of Ricoh make it possible to extract information that is useful for the frontlines.

  • Technology to easily photograph the entire work site (in a spherical image)
  • Technology to appropriately process distorted spherical images
  • Technology to accurately detect target objects (people, text characters, equipment, etc.) in the image

An example of image AI detecting people in a spherical image across various postures, sizes, and numbers of people.

Figure 2. Example results of the recognition of people in a spherical image

Figure 3. Example results of the recognition of text characters in a spherical image
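
The distortion handling mentioned above can be illustrated with a small sketch. It assumes the spherical image is stored in the common equirectangular layout (longitude along the width, latitude along the height) and shows how pixel coordinates map to viewing directions, and how a distortion-free virtual pinhole view can be sampled from the spherical image. The function names and the axis convention (y up, z forward) are assumptions made for this illustration, not a description of Ricoh's implementation.

```python
import math

def pixel_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit direction (x, y, z)."""
    lon = (u / width - 0.5) * 2.0 * math.pi   # longitude, 0 at the image center
    lat = (0.5 - v / height) * math.pi        # latitude, +pi/2 at the top row
    return (math.cos(lat) * math.sin(lon),
            math.sin(lat),
            math.cos(lat) * math.cos(lon))

def direction_to_pixel(x, y, z, width, height):
    """Map a (not necessarily unit) direction back to equirectangular (u, v)."""
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y / norm)))
    return ((lon / (2.0 * math.pi) + 0.5) * width,
            (0.5 - lat / math.pi) * height)

def perspective_sample_coords(out_w, out_h, fov_deg, yaw_deg, src_w, src_h):
    """For each pixel of a virtual pinhole view, return the source (u, v)
    in the equirectangular image that should be sampled."""
    focal = (out_w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    yaw = math.radians(yaw_deg)
    rows = []
    for j in range(out_h):
        row = []
        for i in range(out_w):
            # Ray through the pixel center in the camera frame.
            cx = i + 0.5 - out_w / 2.0
            cy = out_h / 2.0 - (j + 0.5)
            # Rotate the ray around the vertical axis by the yaw angle.
            x = cx * math.cos(yaw) + focal * math.sin(yaw)
            z = -cx * math.sin(yaw) + focal * math.cos(yaw)
            row.append(direction_to_pixel(x, cy, z, src_w, src_h))
        rows.append(row)
    return rows
```

An object detector trained on ordinary photographs could then be run on several such virtual views; that is one common approach, and whether it matches the technology described here is not stated in the source.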

3D point cloud recognition technology that facilitates the handling of data

When three-dimensional point cloud data is acquired using devices such as laser scanners and smartphones, the very large volume of data means that it cannot be handled easily. However, the following applications are made possible by using Ricoh's original AI-powered point cloud recognition technology (technology to identify which object each point represents).

  • Automatic generation of Building Information Modeling (BIM) models
  • Assessment of construction progress by comparing the completed BIM model with the point cloud acquired on site during construction
  • Automatic calculation of the area and volume of a specific section of the site
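
As a rough illustration of the progress comparison listed above, the sketch below checks how much of each designed element is already present in a scan. It assumes that points have been sampled on the surfaces of the BIM elements beforehand and that the scan is registered to the same coordinate system. A simple voxel occupancy test is used as a stand-in; the source does not describe the actual comparison method, and `progress_by_element` and its parameters are hypothetical names.

```python
import math

def voxel_key(point, voxel_size):
    """Return the integer voxel index that contains a 3D point."""
    return tuple(math.floor(c / voxel_size) for c in point)

def progress_by_element(bim_samples, scan_points, voxel_size=0.1):
    """Estimate, per BIM element, the fraction of design samples that
    fall into a voxel occupied by at least one scanned point.

    bim_samples: dict mapping element id -> list of (x, y, z) samples
    scan_points: list of (x, y, z) points from the site scan
    """
    occupied = {voxel_key(p, voxel_size) for p in scan_points}
    progress = {}
    for element_id, samples in bim_samples.items():
        if not samples:
            progress[element_id] = 0.0
            continue
        hits = sum(1 for p in samples if voxel_key(p, voxel_size) in occupied)
        progress[element_id] = hits / len(samples)
    return progress

# Usage with made-up coordinates (metres): one wall is half built,
# the other has not been started.
bim = {
    "wall_A": [(0.05, 0.05, 0.05), (1.05, 0.05, 0.05)],
    "wall_B": [(5.05, 5.05, 5.05)],
}
scan = [(0.06, 0.04, 0.05)]
result = progress_by_element(bim, scan)
```

The voxel size acts as the tolerance between design and as-built geometry; a practical system would also need to account for registration error and occlusion.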

Figure 4. Point cloud recognition results for an office meeting room: Items such as the table (purple) and chairs (red) are recognized by category

Figure 5. Automatic generation of BIM models using category recognition on point clouds acquired in the office
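
Once every point carries a category label, the area calculation mentioned above becomes straightforward. The following is a minimal sketch, assuming each point is a tuple (x, y, z, label) in metres and that the target surface is roughly horizontal: it projects the points of one category onto a grid and counts the occupied cells. The label names and the function `estimate_area` are illustrative assumptions.

```python
import math

def estimate_area(labeled_points, target_label, cell_size=0.05):
    """Approximate the horizontal area covered by points of one category.

    labeled_points: iterable of (x, y, z, label) tuples
    cell_size: edge length of a grid cell in the same unit as x and y
    """
    cells = set()
    for x, y, _z, label in labeled_points:
        if label == target_label:
            cells.add((math.floor(x / cell_size), math.floor(y / cell_size)))
    return len(cells) * cell_size * cell_size

# Usage with synthetic data: a 1 m x 1 m floor patch sampled every 0.1 m,
# plus one chair point that must be ignored.
points = [(i * 0.1 + 0.05, j * 0.1 + 0.05, 0.0, "floor")
          for i in range(10) for j in range(10)]
points.append((0.5, 0.5, 0.4, "chair"))
floor_area = estimate_area(points, "floor", cell_size=0.1)
```

The result depends on the cell size relative to the point density: cells that are too small leave gaps between points and underestimate the area.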

Knowledge transfer combining work site document analysis technology and generative AI

In recent years, increasing attention has been paid to combining LLMs (Large Language Models) with RAG (Retrieval-Augmented Generation), in which responses are generated while referring to a company's own document data. Combining RAG with three-dimensional information (digital twins) supports the work performed by workers at a site, for example:

  • Troubleshooting by referring to the inspection history of each piece of equipment (when performing equipment inspections)
  • Preparation of templates for new documents by referring to past documents (when preparing reports)

However, the document data at a work site often exists only as paper documents or as image data obtained by scanning paper documents. In this state the content is not available as machine-readable text, so it cannot be used for RAG.

To address this problem, Ricoh's AI-OCR technology extracts only the necessary information from complex documents that contain various elements such as tables and figures. The extracted data is then used to generate structured data that can be used for RAG.

This is an image showing the mechanism for dialogue while referring to customer-specific document data using RAG.

Figure 6. RAG system to support work sites
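
The retrieval step of such a system can be sketched as follows. This is a simplified illustration, not the system shown in Figure 6: it assumes the OCR output has already been structured into records carrying a document ID and an equipment ID (for example, the equipment selected in the digital twin), and it uses plain word-overlap cosine similarity as a stand-in for the embedding search a production RAG system would typically use. The `Chunk` fields and the sample records are invented for the example, and the call to the LLM itself is omitted.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

@dataclass
class Chunk:
    doc_id: str        # hypothetical ID of the source document
    equipment_id: str  # hypothetical link to an object in the digital twin
    text: str          # text extracted by OCR

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, equipment_id=None, top_k=2):
    """Return the top_k chunks most similar to the query, optionally
    restricted to one piece of equipment."""
    query_vec = Counter(tokenize(query))
    candidates = [c for c in chunks
                  if equipment_id is None or c.equipment_id == equipment_id]
    candidates.sort(key=lambda c: cosine(query_vec, Counter(tokenize(c.text))),
                    reverse=True)
    return candidates[:top_k]

def build_prompt(query, retrieved):
    """Assemble the text that would be passed to an LLM."""
    context = "\n".join(f"[{c.doc_id}] {c.text}" for c in retrieved)
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

chunks = [
    Chunk("insp-001", "pump-1", "Pump vibration high, bearing replaced"),
    Chunk("insp-002", "pump-1", "Routine visual check, no issues"),
    Chunk("insp-003", "fan-2", "Fan vibration high, belt tightened"),
]
question = "vibration high on pump"
hits = retrieve(question, chunks, equipment_id="pump-1", top_k=1)
prompt = build_prompt(question, hits)
```

Filtering by equipment before ranking reflects the idea of tying documents to objects in the three-dimensional model, so that an inspector who selects a piece of equipment sees only its own inspection history.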

Ricoh's Vision

Ricoh will help customers transform the way they work on the frontlines by combining AI technology with the optical, software, and image processing technologies that the Company has cultivated over many years in the development of office equipment.

In addition, AI personnel with extensive development experience will work side by side with the customer to solve customer-specific issues.