iPhone: How To Use iOS Text Detection Features

Posted on August 4, 2020 by Mona

The Vision framework, introduced in iOS 11, has everything you need to create an app that recognizes text characters in real time. You don’t need deep machine-learning knowledge – navigating the feature is quite simple, and the implementation is seamless.

Vision Framework

The Vision framework lets you easily implement any task that involves computer vision. The framework performs face and face-landmark detection, barcode recognition, image registration, general feature tracking, and text detection. Vision also allows you to use custom Core ML models for tasks like classification or object detection.

VNDetectTextRectanglesRequest

VNDetectTextRectanglesRequest is an image-analysis request that finds regions of visible text in an image; it returns each text character as a rectangular bounding box with an origin and size.

If you are used to Swift and have been programming for a while, you are probably wondering what the point of Vision is when frameworks like Core Image and AVFoundation offer similar features. Vision is more accurate and more straightforward, and it is available across a variety of Apple platforms. However, using Vision may require more processing power and processing time.

To use Vision for text detection, you will need Xcode 9 and a device running iOS 11.

Creating a Camera with AVCapture

First, you need to create a camera with AVCapture; do this by initializing an AVCaptureSession object to perform real-time or offline capture. After that, connect the session to the capture device.

To save time building your app’s UI, consider beginning with a starter project; this will let you focus on learning the Vision framework.

  1. Open your starter project. The views in the storyboard should already be set up for you.
  2. In ViewController.swift, find the code section with the functions and outlets.
  3. Below the imageView outlet, declare an AVCaptureSession – this is used whenever you want actions performed on a live stream.
  4. Set the AVCaptureSession’s media type to video so the camera runs continuously.
  5. Define the output and input devices.
  6. The input is what the camera sees; the output is video in the kCVPixelFormatType_32BGRA pixel format.
  7. Finally, add a sublayer containing the video preview to imageView and start the session, wrapping all of this setup in a function. You also need to set the layer’s frame.
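The steps above might be sketched as follows. This is a minimal sketch, not the article’s exact starter project: the class name, the imageView outlet, and the startLiveVideo function name are assumptions.

```swift
import UIKit
import AVFoundation

class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {
    @IBOutlet weak var imageView: UIImageView!

    // Step 3: declare the session used for the live stream.
    let session = AVCaptureSession()

    // Steps 4–7, wrapped in one function.
    func startLiveVideo() {
        // Step 4: configure the session for video capture.
        session.sessionPreset = .photo
        guard let camera = AVCaptureDevice.default(for: .video),
              let input = try? AVCaptureDeviceInput(device: camera) else { return }

        // Steps 5–6: the input is what the camera sees; the output is
        // video frames in the 32BGRA pixel format.
        let output = AVCaptureVideoDataOutput()
        output.videoSettings =
            [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
        output.setSampleBufferDelegate(self, queue: DispatchQueue(label: "videoQueue"))
        session.addInput(input)
        session.addOutput(output)

        // Step 7: add the video preview as a sublayer of imageView,
        // set its frame, and start the session.
        let previewLayer = AVCaptureVideoPreviewLayer(session: session)
        previewLayer.frame = imageView.bounds
        imageView.layer.addSublayer(previewLayer)
        session.startRunning()
    }
}
```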

Call the function in the viewWillAppear method.

Because the view’s bounds are not yet finalized at that point, override the viewDidLayoutSubviews() method to update the layer’s bounds.

Since iOS 10, an additional entry in Info.plist is required to provide a reason for using the camera: set the Privacy – Camera Usage Description key (NSCameraUsageDescription).

Text Detection: How the Vision Framework Works

There are three steps to implementing Vision in the app.

  1. Requests – when you ask the framework to detect something for you.
  2. Handlers – when you want the framework to do something after the request is called.
  3. Observations – what you do with the resulting data the framework supplies for your request.

Ideally, you create one text request as a VNDetectTextRectanglesRequest. This is a kind of VNRequest that finds the borders around text. When the framework completes the request, it calls the completion handler you supplied. If you also want to know the exact frame of each character that was recognized, set reportCharacterBoxes = true.
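Creating such a request might look like this; the body of the completion handler here is only illustrative:

```swift
import Vision

// Ask Vision for rectangles around visible text, and report the
// individual character boxes inside each region as well.
let textRequest = VNDetectTextRectanglesRequest { request, error in
    guard let observations = request.results as? [VNTextObservation] else { return }
    // Each observation is one detected text region.
    print("Detected \(observations.count) text region(s)")
}
textRequest.reportCharacterBoxes = true
```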

After that, define the observations that contain all the results of the VNDetectTextRectanglesRequest, and remember to hook Vision up to the camera output. Since Vision exposes high-level APIs, working with it is safe and straightforward.

The delegate function checks that the CMSampleBuffer exists and came from the AVCaptureOutput. You should then create a requestOptions variable as a dictionary of type [VNImageOption: Any] – VNImageOption is a type that contains properties and data from the camera. Finally, create the VNImageRequestHandler and execute the text request.
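A sketch of that delegate callback, assuming the ViewController from earlier holds the request in a textRequest property (an assumption, not the article’s exact code):

```swift
import AVFoundation
import Vision

extension ViewController {
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Make sure this frame actually carries a pixel buffer.
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        // Pass along camera data (e.g. intrinsics) when it is available.
        var requestOptions: [VNImageOption: Any] = [:]
        if let cameraData = CMGetAttachment(
                sampleBuffer,
                key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                attachmentModeOut: nil) {
            requestOptions = [.cameraIntrinsics: cameraData]
        }

        // Execute the text request against this frame.
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                            orientation: .right,
                                            options: requestOptions)
        try? handler.perform([textRequest])
    }
}
```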

Drawing Borders Around the Detected Text

You can start by having the framework draw two boxes: one for every letter it detects and another for every word. A word box is the combination of all the character boxes your request finds.

  • Define the points on your view that help you position the boxes.
  • After that, create a CALayer; use VNRectangleObservation to define your constraints, making it easier to outline the box.
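One way to sketch those drawing helpers, assuming the preview fills the target view (the function names are illustrative):

```swift
import UIKit
import Vision

// Convert a normalized Vision box (origin at the bottom-left) into
// UIKit view coordinates (origin at the top-left) and outline it.
func outline(_ box: CGRect, in view: UIView, color: UIColor) {
    let frame = CGRect(x: box.minX * view.bounds.width,
                       y: (1 - box.maxY) * view.bounds.height,
                       width: box.width * view.bounds.width,
                       height: box.height * view.bounds.height)
    let layer = CALayer()
    layer.frame = frame
    layer.borderWidth = 2.0
    layer.borderColor = color.cgColor
    view.layer.addSublayer(layer)
}

// A red box around the whole word…
func highlightWord(_ region: VNTextObservation, in view: UIView) {
    outline(region.boundingBox, in: view, color: .red)
}

// …and a blue box around each individual character.
func highlightLetter(_ box: VNRectangleObservation, in view: UIView) {
    outline(box.boundingBox, in: view, color: .blue)
}
```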

You now have all your functions laid out.

To connect the dots, begin by having your code run asynchronously. You should then check whether a region exists within the results of your VNTextObservation.

You can now call the function that draws a box around the region. Check whether there are character boxes within the region, then call the function that draws a box around each letter.
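Putting those pieces together, the request’s completion handler might look like this; the handler name and the highlightWord/highlightLetter helpers are hypothetical:

```swift
import UIKit
import Vision

extension ViewController {
    // Process the request results on the main queue, clear the old
    // outlines, then draw word and character boxes.
    func detectTextHandler(_ request: VNRequest, error: Error?) {
        guard let observations = request.results as? [VNTextObservation] else { return }
        DispatchQueue.main.async {
            // Keep the first sublayer (the video preview); drop old boxes.
            self.imageView.layer.sublayers?.removeSubrange(1...)
            for region in observations {
                // Box around the whole word.
                self.highlightWord(region, in: self.imageView)
                // Box around each letter, if character boxes were reported.
                if let boxes = region.characterBoxes {
                    for box in boxes {
                        self.highlightLetter(box, in: self.imageView)
                    }
                }
            }
        }
    }
}
```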

After that, create the requestOptions variable. You can now create a VNImageRequestHandler object and perform the text request you created.

Finally, run your Vision code on the live stream. You will need to take the video output and receive it as a CMSampleBuffer.

Additional Tips

  1. Always try to crop the image and process only the section you need. This reduces processing time and memory footprint.
  2. Turn on language correction when dealing with non-numeric characters, and turn it off when dealing with numeric characters.
  3. Include validation for recognized number strings to confirm accuracy and avoid showing false values to the user.
  4. The document camera controller is the best companion for text recognition, since image quality plays a significant role in text recognition.
  5. Consider setting a minimum text height to increase performance.
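For reference, the language-correction and minimum-text-height knobs from tips 2 and 5 live on the newer VNRecognizeTextRequest (iOS 13+), which also returns the recognized strings rather than just rectangles. A sketch, assuming iOS 13 availability:

```swift
import Vision

// VNRecognizeTextRequest recognizes the text itself, not just boxes.
let recognizeRequest = VNRecognizeTextRequest { request, error in
    guard let results = request.results as? [VNRecognizedTextObservation] else { return }
    for observation in results {
        // Take the single best candidate string per observation.
        if let candidate = observation.topCandidates(1).first {
            print(candidate.string)
        }
    }
}
recognizeRequest.usesLanguageCorrection = true  // tip 2: turn off for numeric input
recognizeRequest.minimumTextHeight = 0.03       // tip 5: fraction of image height
```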

With Vision, you have everything you need for text recognition. Since Vision is easy to use and quick to implement, using it is almost like playing with Legos. Try testing your app on different objects, fonts, lighting conditions, and sizes. You can also impress yourself by combining Vision with Core ML.
