{"id":36008,"date":"2021-12-06T14:13:32","date_gmt":"2021-12-06T14:13:32","guid":{"rendered":"https:\/\/ubuntuhandbook.org\/?p=36008"},"modified":"2025-07-22T11:30:57","modified_gmt":"2025-07-22T11:30:57","slug":"install-tesseract-ocr-5-ubuntu","status":"publish","type":"post","link":"https:\/\/ubuntuhandbook.org\/index.php\/2021\/12\/install-tesseract-ocr-5-ubuntu\/","title":{"rendered":"How to Install Latest Tesseract OCR 5 in Ubuntu 24.04 | 22.04"},"content":{"rendered":"<p><a href=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-icon.webp\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft size-thumbnail wp-image-36009\" src=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-icon-250x250.webp\" alt=\"\" width=\"250\" height=\"250\" srcset=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-icon-250x250.webp 250w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-icon-300x300.webp 300w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-icon-600x600.webp 600w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-icon-768x768.webp 768w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-icon.webp 1200w\" sizes=\"auto, (max-width: 250px) 100vw, 250px\" \/><\/a><\/p>\n<p>This simple tutorial shows how to install the latest Tesseract OCR engine in all current Ubuntu releases (Ubuntu 24.04, Ubuntu 22.04, and Ubuntu 20.04) via PPA.<\/p>\n<p><a href=\"https:\/\/github.com\/tesseract-ocr\/tesseract\" target=\"_blank\" rel=\"noopener\">Tesseract<\/a> is one of the most accurate open-source OCR engines. It reads a wide variety of image formats and converts them to text in over 40 languages. 
Tesseract 5.0.0 was officially released a few days ago, and it features:<\/p>\n<ul>\n<li>Faster training and OCR performance with lower memory usage via &#8216;fast floats&#8217;.<\/li>\n<li>Support for the latest macOS and Apple Silicon.<\/li>\n<li>Better ARM\/ARM64 support.<\/li>\n<li>API improvements and more.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-ocr.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-36012\" src=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-ocr-600x322.jpg\" alt=\"\" width=\"600\" height=\"322\" srcset=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-ocr-600x322.jpg 600w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-ocr-300x161.jpg 300w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-ocr-768x412.jpg 768w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-ocr.jpg 1276w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<h3>How to Install Tesseract OCR in Ubuntu:<\/h3>\n<p>The optical character recognition engine is available in the Ubuntu repositories, though the packaged version is usually outdated.<\/p>\n<p>Alexander Pozdnyakov, the maintainer of Tesseract OCR in the official <b>Debian\/Ubuntu<\/b> repositories, also maintains a few PPAs with the latest packages. 
Most CPU architectures (<code>amd64<\/code>, <code>i386<\/code>, <code>arm64<\/code>\/<code>armhf<\/code>, <code>ppc64el<\/code>, <code>s390x<\/code>) are supported.<\/p>\n<h4>Option 1: Add Tesseract 4.x PPA<\/h4>\n<p>For the latest release of <b>Tesseract OCR 4<\/b> (v4.1.3 so far), the <a href=\"https:\/\/launchpad.net\/~alex-p\/+archive\/ubuntu\/tesseract-ocr\" target=\"_blank\" rel=\"noopener\">stable PPA<\/a> contains the packages for Ubuntu <b>18.04<\/b>, Ubuntu <b>20.04<\/b>, Ubuntu <b>21.10<\/b>, and old Ubuntu <b>16.04<\/b>\/<b>14.04<\/b>.<\/p>\n<p>Press <b>Ctrl+Alt+T<\/b> on the keyboard to open a terminal. When it opens, run the command below to add the PPA:<\/p>\n<pre>sudo add-apt-repository ppa:alex-p\/tesseract-ocr<\/pre>\n<p><i>Type your user password when prompted (there is no visual feedback) and hit Enter to continue.<\/i><\/p>\n<p><a href=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-4-ppa.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-36013\" src=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-4-ppa.png\" alt=\"\" width=\"600\" height=\"246\" srcset=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-4-ppa.png 600w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/tesseract-4-ppa-300x123.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<h4>Option 2: Add Tesseract 5 PPA<\/h4>\n<p>The new 5.x release series (5.4.1 so far) is available in <a href=\"https:\/\/launchpad.net\/~alex-p\/+archive\/ubuntu\/tesseract-ocr5\" target=\"_blank\" rel=\"noopener\">another PPA<\/a> for Ubuntu 25.04, Ubuntu <b>24.04<\/b>, Ubuntu <b>22.04<\/b>, and Ubuntu <b>20.04<\/b>.<\/p>\n<p>Again, press Ctrl+Alt+T to open a terminal and run the command:<\/p>\n<pre>sudo add-apt-repository ppa:alex-p\/tesseract-ocr5<\/pre>\n<p><b>NOTE:<\/b> installing Tesseract from this PPA will replace the old 4.x packages, though it&#8217;s not 100% API compatible 
with v4.0.<\/p>\n<p><a href=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/add-tesseractocr5-ppa.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-37397\" src=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/add-tesseractocr5-ppa.png\" alt=\"\" width=\"600\" height=\"235\" srcset=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/add-tesseractocr5-ppa.png 600w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/add-tesseractocr5-ppa-300x118.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<h4>Option 3: Add Tesseract repository for Debian:<\/h4>\n<p>For <b>Debian Stretch<\/b>, <b>Buster<\/b>, <b>Bullseye<\/b>, and <b>Sid<\/b>, there are apt repositories for both Tesseract v4 and v5. Users can follow the button below to add the repository:<\/p>\n<div class=\"wp-block-buttons aligncenter\">\n<div class=\"wp-block-button is-style-fill\"><a class=\"wp-block-button__link has-vivid-cyan-blue-to-vivid-purple-gradient-background has-text-color has-background\" href=\"https:\/\/notesalexp.org\/tesseract-ocr\/#tesseract_5.x\" target=\"_blank\" rel=\"noreferrer noopener\">Tesseract repository for Debian<\/a><\/div>\n<\/div>\n<h4>Update and Install Tesseract:<\/h4>\n<p>After adding a PPA or repository from the previous options, run the command below in a terminal to refresh the system package cache, in case you&#8217;re still running the old Ubuntu 18.04 or earlier:<\/p>\n<pre>sudo apt update<\/pre>\n<p>Finally, install the OCR engine with the command:<\/p>\n<pre>sudo apt install tesseract-ocr<\/pre>\n<p><a href=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/apt-tesseractocr.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-36015\" src=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/apt-tesseractocr.png\" alt=\"\" width=\"600\" height=\"277\" 
srcset=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/apt-tesseractocr.png 600w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/apt-tesseractocr-300x139.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<p>Or, upgrade the package using Software Updater:<\/p>\n<p><a href=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/update-tesseract-ocr.webp\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-36016\" src=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/update-tesseract-ocr-600x424.webp\" alt=\"\" width=\"600\" height=\"424\" srcset=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/update-tesseract-ocr-600x424.webp 600w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/update-tesseract-ocr-300x212.webp 300w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/update-tesseract-ocr-768x543.webp 768w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/update-tesseract-ocr.webp 786w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<h3>How to Remove PPAs &amp; Uninstall Tesseract OCR:<\/h3>\n<p>To remove the PPAs, either run the previous <code>add-apt-repository<\/code> command with the <code>--remove<\/code> flag, or use the <b>Software &amp; Updates<\/b> utility under the &#8216;Other Software&#8217; tab.<\/p>\n<p><a href=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/remove-tesseract-ppa.webp\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-36017\" src=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/remove-tesseract-ppa-600x290.webp\" alt=\"\" width=\"600\" height=\"290\" srcset=\"https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/remove-tesseract-ppa-600x290.webp 600w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/remove-tesseract-ppa-300x145.webp 300w, 
https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/remove-tesseract-ppa-768x371.webp 768w, https:\/\/ubuntuhandbook.org\/wp-content\/uploads\/2021\/12\/remove-tesseract-ppa.webp 1012w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><\/p>\n<p>To remove the OCR engine, use the command:<\/p>\n<pre>sudo apt remove --autoremove tesseract-ocr tesseract-ocr-*<\/pre>\n<p>You may also remove the <code>libtesseract*<\/code> packages; however, doing so will also remove other applications (e.g., gImageReader) that depend on them.<\/p>","protected":false},"excerpt":{"rendered":"<p>This simple tutorial shows how to install the latest Tesseract OCR engine in all current Ubuntu releases (Ubuntu 24.04, Ubuntu 22.04, and Ubuntu 20.04) via PPA. Tesseract is the most accurate open-source OCR engine that reads a wide variety of image formats and converts them to text in over 40 languages. Tesseract 5.0.0 was officially [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":36009,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[2058],"class_list":["post-36008","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-howtos","tag-tesseract-ocr"],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/posts\/36008","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/comments?post=36008"}],"version-history":[{"count":0,"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/posts\/36008\/revisions"}],"wp:featuredmedia":[{"embeddable
":true,"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/media\/36009"}],"wp:attachment":[{"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/media?parent=36008"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/categories?post=36008"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ubuntuhandbook.org\/index.php\/wp-json\/wp\/v2\/tags?post=36008"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}