The next library we are going to look at is called Kraken, which was developed at Université PSL in Paris. It's actually based on a slightly older code base, OCRopus, and you can see how flexible open source licenses allow new ideas to grow by building upon older ideas. And in this case, I fully support the idea that the kraken, a mythical massive sea creature, is the natural progression of an octopus. What we're going to use Kraken for is to detect lines of text as bounding boxes in a given image. The biggest limitation of Tesseract is the lack of a layout engine inside of it. Tesseract expects to be working with fairly clean text, and it gets confused if we don't crop out other artifacts. It's not bad, but Kraken can help us by segmenting pages, so let's take a look. First we'll take a look at the kraken module itself, so import kraken and let's run help on kraken. There isn't much of a discussion here, but there are a number of sub-modules that look interesting. I spent a bit of time on their website, and I think the pageseg module, which handles all of the page segmentation, is the one we want to use, so let's look at it. So from kraken we'll import pageseg, and then help on pageseg. It looks like there are a few different functions that we can call, and the segment function looks particularly appropriate. I love how expressive this library is on the documentation front; I can see immediately that we're working with PIL.Image files, and the author has even indicated that we need to pass in either a binarized (mode '1') or grayscale (mode 'L', for luminance) image. We can also see that the return value is a dictionary object with two keys: text_direction, which will return to us a string describing the direction of the text, and boxes, which appears to be a list of tuples, where each tuple is a box in the original image. Let's try this on an image of text. I have a simple bit of text in a file called two_col.png, which is from a newspaper on campus here.
So from PIL we'll import Image as normal, and then Image.open('readonly/two_col.png'). Let's display the image inline, so we'll just call display with im, and let's now convert it to black and white and segment it up into lines with Kraken. For this we'll make a new variable, bounding_boxes = pageseg.segment(im.convert('1'))['boxes'] — we convert the image to binarize it, then pull out the boxes. And let's print those lines to the screen, so I'll just print bounding_boxes. All right, so we see the image here, and we see the bounding boxes. Okay, so pretty simple: two-column text, and then a list of lists, which are the bounding boxes of the lines of that text. Let's write a little routine to try and see the effects a bit more clearly. I'm going to clean up my act a bit and write real documentation too; it's good practice. So def show_boxes, we'll call it, and we'll take in a parameter img. For the docstring, I'll say it modifies the passed image to show a series of bounding boxes, as found by Kraken. Our parameter img is a PIL.Image object; that makes it easier for other people to use this function. And our return value is also going to be an image, the modified PIL.Image object. Okay, let's bring in our ImageDraw object first, so from PIL import ImageDraw. This was covered in our earlier lectures; you can go back if you're interested. And let's grab a drawing object to annotate that image. So we'll create a new variable, drawing_object = ImageDraw.Draw, and we'll pass in the image that we want to be able to draw on. We can then create a set of boxes using pageseg.segment. So bounding_boxes = pageseg.segment, and we'll convert our image — remember, we have to binarize our luminance image — and pull out the boxes. Now let's go through that list of bounding boxes. So for box in bounding_boxes, we're just going to draw a nice rectangle. So drawing_object.rectangle, we'll give it the box we're interested in, and we'll set the fill to None and the outline to 'red'.
And to make it easy, we're just going to return that image object, so return img. To test this, let's use display: display(show_boxes(...)), where we read in the image again with Image.open. We could, of course, reuse the image, but re-reading it is good practice when you're using Jupyter notebooks. All right, so we see our image here with a bunch of red boxes. Not bad at all. It's interesting to see that Kraken isn't completely sure what to do with this two-column format. In some cases, Kraken has identified a line in just a single column, while in other cases Kraken has expanded the line marker all the way across the page. Does this matter? Well, it really depends on our goal; in this case, I want to see if we can improve on this a little bit. So we're going to go a bit off script here. While this week of lectures is about libraries, the goal of this last lecture is actually to give you confidence that you can apply your knowledge to actual programming tasks, even if the library you're using doesn't quite do what you want. I'd like you to pause the video for a moment and collect your thoughts. Looking at the image above, with the two-column example and red boxes, how do you think we might modify this image to improve Kraken's ability to detect lines? Thanks for sharing your thoughts; I'm looking forward to seeing the breadth of ideas that everyone on the course comes up with. Here's my partial solution: when looking through the Kraken docs on the pageseg.segment function, I saw that there were a few parameters we can supply in order to improve the segmentation. One of these is the black_colseps parameter. If set to True, Kraken will assume that the columns are separated by black lines. That isn't the case here, but I think we have all of the tools we need to go through and actually change the source image to have a black separator between columns. So the first step is that I want to update the show_boxes function.
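Here's a sketch of the show_boxes routine as described above. One caveat: I've split the drawing logic into a draw_boxes helper of my own, and moved the kraken import inside show_boxes, so that the drawing part can be tried even on a machine without Kraken installed — that structure is my choice, not the lecture's.

```python
from PIL import Image, ImageDraw

def draw_boxes(img, boxes):
    """Draw a red rectangle outline for each (left, top, right, bottom) box."""
    drawing_object = ImageDraw.Draw(img)
    for box in boxes:
        drawing_object.rectangle(box, fill=None, outline='red')
    return img

def show_boxes(img):
    """Modifies the passed image to show a series of bounding boxes
    on an image of text, as found by kraken.

    :param img: A PIL.Image object
    :return: The modified PIL.Image object
    """
    from kraken import pageseg  # lazy import so draw_boxes works without kraken
    bounding_boxes = pageseg.segment(img.convert('1'))['boxes']
    return draw_boxes(img, bounding_boxes)
```

You would call it as in the lecture: display(show_boxes(Image.open('readonly/two_col.png'))).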
I'm going to just do a quick copy and paste from above, but adding the black_colseps=True parameter. Okay, the next step is to think of the algorithm we want to apply to detect a white column separator. In experimenting a bit, I decided that I only wanted to add the separator if the space was at least 25 pixels wide, which is roughly the width of a character, and six lines high. The width is easy: let's just make a variable, char_width = 25. The height is harder, since it depends on the height of the text, so I'm going to write a routine to calculate the average height of a line. So def calculate_line_height, and we'll pass in the img. The docstring for this is that it calculates the average height of a line from a given image. We'll take a PIL.Image object, and we'll return the average height of a line in pixels. Let's get a list of the bounding boxes for this image, so we'll segment it using pageseg.segment — remember, binarize always — and we just want to pull out the boxes. Each box is a tuple of (left, top, right, bottom), so the height is just the bottom minus the top. Let's calculate this over the set of all boxes: we'll set a height accumulator to 0, then for box in bounding_boxes, height_accumulator = height_accumulator + box[3] - box[1]. And this is a bit tricky — remember that we start counting from the upper left corner in PIL. For those of you who are used to starting in the lower left corner, that's not true with images; we normally start in the upper left. Now let's just return the average height, reduced to a whole pixel by making it an integer. So we'll return the height accumulator divided by the number of bounding boxes, type cast to an integer, which will just truncate off the fraction. And let's test this with the image that we've been using up til now. So line_height = calculate_line_height(Image.open('readonly/two_col.png')), and we'll just print out the line height.
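As a sketch, the routine just described might look like the following. The average_height helper is my own split, so the arithmetic can be checked without Kraken installed; the wrapper matches the function the lecture builds.

```python
from PIL import Image

def average_height(boxes):
    """Average height of a list of (left, top, right, bottom) boxes,
    truncated to a whole number of pixels."""
    height_accumulator = 0
    for box in boxes:
        # box[3] is the bottom edge, box[1] is the top edge (origin is upper left)
        height_accumulator = height_accumulator + box[3] - box[1]
    return int(height_accumulator / len(boxes))

def calculate_line_height(img):
    """Calculates the average height of a line from a given image.

    :param img: A PIL.Image object
    :return: The average line height in pixels
    """
    from kraken import pageseg  # lazy import, as before
    return average_height(pageseg.segment(img.convert('1'))['boxes'])
```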
Okay, so the average height of a line is 31 pixels. Now we want to scan through the image, looking at each pixel in turn, to determine if there's a block of white space. How big a block should we look for? That's a bit more of an art than a science. Looking at our sample image, I'm going to say an appropriate block should be one character width wide and six line heights tall. But honestly, I just made this up by eyeballing the image, so I would encourage you to play with the values as you explore. Let's create a new box called gap_box that represents this area. So gap_box = (0, 0, char_width, line_height * 6); let's just look at that gap_box. It seems we'll want to have a function which, given a pixel and an image, can check to see if that pixel has white space to the right of and below it. Essentially, we want to test to see if the pixel is in the upper left corner of something that looks like the gap_box. If so, then we should insert a line to break up this box before sending it to Kraken. Let's call this new function gap_check, so def gap_check, and we'll pass in an image and a location. Here the docstring is that we check the image, at a given (x, y) location, to see if it fits the description of a gap_box. Our first parameter is a PIL.Image file; our second parameter, location, is a tuple (x, y), which is a pixel location in that image. So we're not going to pass x and y separately; we're going to pass x and y together as a tuple. We're going to return True if it fits the definition of a gap_box; otherwise, we'll return False. Recall that we can get a pixel using the Image.getpixel function from PIL. It normally returns a value as a tuple of integers, one for each color channel. Our tools all work with binarized images — black and white — so we should just get one value. If the value is 0, it's a black pixel; if it's white, then the value should be 255. We're going to assume that the image is already in the correct mode, in that it is already binarized.
The algorithm to check our bounding box is fairly easy. We have a single location, which is our start, and then we want to check all the pixels to the right of that location, up to gap_box[2]. So for x in range(location[0], location[0] + gap_box[2]) — location[0] is our x-value, and gap_box[2] is our offset. The height is basically the same, so let's iterate a y-variable up to gap_box[3]: for y in range(location[1], location[1] + gap_box[3]). We want to check if the pixel is white, but only if we're still within the image. So: if x is less than img.width and y is less than img.height. If the pixel is white, we don't want to do anything; if it's black, we just want to finish and return False. So, if img.getpixel((x, y)) != 255, then we'll return False. If we've managed to walk through the whole gap_box without finding any non-white pixels, then this is actually a gap, so we'll just return True. All right, we have a function to check for a gap, called gap_check. What should we do once we find one? For this, let's just draw a line in the middle of it, with a new function. So def draw_sep, I'll call it, and it'll take an image and a location. This draws a line in the image in the middle of a gap discovered at the location. Note that it doesn't draw the line at the location itself, but at the middle of the gap_box starting at that location. So the parameters are a PIL.Image file, and then our tuple (x, y), which is the pixel location. First, let's bring in all of our drawing code. So from PIL we'll import ImageDraw, and create a drawing object, which equals ImageDraw.Draw(img). Next, let's decide what the middle means in terms of coordinates in the image. So x1 = location[0] plus the gap_box x size divided by 2, rounded to an int. And our x2 location is actually just the same thing, since this is a one-pixel vertical line, so we'll just say x2 = x1.
Our starting y-coordinate is just the y-coordinate that was passed in, which is the top of the box, so y1 = location[1]. But we want our final y-coordinate to be the bottom of the box, so y2 = y1 + the gap_box height, which is gap_box[3]. And then we'll actually do the work: drawing_object.rectangle, we'll pass in (x1, y1, x2, y2), set the fill to 'black', and I'll set the outline to 'black' too, and that will draw a nice vertical rule. And we don't have anything we need to return from this, because we actually modify the image directly in place. All right, now let's wire it up. This is pretty easy: we can just iterate through each pixel in the image, check if there's a gap, and insert a line if there is. So def process_image, taking an image — we're going to take in an image of text and add black vertical bars to break up columns, with a PIL.Image file both in and out. We're going to start with a familiar iteration process: for x in range of the width, and for y in range of the height, we'll check to see if there's a gap at this point. So if gap_check, passing it the image and the tuple, is True, then we're going to update the image and draw a separator in it. So then we'll just call our draw_sep(img, (x, y)). And for good measure we'll return the image we modified, so return img. All right, let's test it out. Let's read in our test image and convert it through the binarization process. So i is our new image: we'll read it in and convert it to luminance here. Then we're going to call process_image, and since we returned the image, we're going to display it. Now, you'll notice immediately that this function didn't return right away. In fact, you're sitting there kind of wondering what's happening, and you can see the asterisk in the margin in Jupyter. That tells you that the back-end processor is still working; this will actually take a fair bit of time on the Coursera system.
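Putting the pieces from this section together, here's a sketch of the whole pipeline — gap_check, draw_sep, and process_image — using the char_width and line_height values from the lecture (25 and 31). Treat both as knobs to experiment with, and remember this is a slow, deliberately naive implementation.

```python
from PIL import Image, ImageDraw

char_width = 25               # roughly one character wide (eyeballed in the lecture)
line_height = 31              # the average line height we calculated earlier
gap_box = (0, 0, char_width, line_height * 6)

def gap_check(img, location):
    """Checks the img at a given (x, y) location to see if it fits the
    description of a gap_box.

    :param img: A binarized PIL.Image file
    :param location: A tuple (x, y) which is a pixel location in that image
    :return: True if the region is entirely white, otherwise False
    """
    for x in range(location[0], location[0] + gap_box[2]):
        for y in range(location[1], location[1] + gap_box[3]):
            # only inspect pixels that are actually inside the image
            if x < img.width and y < img.height:
                # any non-white pixel means this is not a gap
                if img.getpixel((x, y)) != 255:
                    return False
    return True

def draw_sep(img, location):
    """Draws a one-pixel-wide black vertical line down the middle of the
    gap_box anchored at the given (x, y) location."""
    drawing_object = ImageDraw.Draw(img)
    x1 = location[0] + int(gap_box[2] / 2)  # horizontal middle of the gap
    x2 = x1                                 # one pixel wide, so x2 == x1
    y1 = location[1]                        # top of the box
    y2 = y1 + gap_box[3]                    # bottom of the box
    drawing_object.rectangle((x1, y1, x2, y2), fill='black', outline='black')

def process_image(img):
    """Takes in an image of text and adds black vertical bars to break up
    columns.

    :param img: A PIL.Image file
    :return: The modified PIL.Image file
    """
    for x in range(img.width):
        for y in range(img.height):
            if gap_check(img, (x, y)):
                draw_sep(img, (x, y))
    return img
```

To run it end to end as in the lecture: i = Image.open('readonly/two_col.png').convert('L'), then display(process_image(i)) — and expect it to take a while.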
So reflect a little bit on what's happening in the code that we've written. We're iterating over every pixel in the image, in both the x and y directions, and we're looking to see if there's a gap box to the right of and below that pixel. We draw a line if there is, and then we just go immediately on to the next pixel. So you can see there are lots of opportunities for optimization of this code, and it's really meant to be a demonstration of what you can do yourself when you start combining these libraries. We're going to use the magic of video to speed this up a little bit for the video lecture, but if you're following along in the Jupyter notebooks — and I hope that you are — please think about how you might change this code to modify the image more efficiently. Not bad at all. The effect at the bottom of the image is a bit unexpected to me, but it makes sense, and you can imagine several ways we might try and control for it. Let's see how this new image works when we try and run it through the Kraken layout engine. So we'll say display(show_boxes(i)), and because we stored i, that makes it easy. It looks like that's actually pretty accurate and fixes the problems we faced. Feel free to experiment with different settings for the gap heights and widths and share on the forums. You'll notice that this method is really quite slow, which is a bit of a problem if we wanted to use it on larger texts, but I wanted you to see how you can mix your own logic with the libraries you're using. Just because Kraken didn't work perfectly out of the box, it doesn't mean we can't build something more specific to our use case on top of it. I want to end this lecture with a pause, and ask you to reflect on the code we've written here. We started this course with some pretty simple use of libraries, but now we're digging in deeper and solving problems ourselves with the help of those libraries.
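If you want a head start on the optimization question, here's one idea — my own sketch, not the lecture's approach, and it assumes numpy is available. Instead of testing every pixel, it uses numpy to find runs of image columns that are entirely white. It only catches gutters that run the full height of the page, so it's a complement to, not a replacement for, the pixel-by-pixel scan, but it is dramatically cheaper.

```python
import numpy as np
from PIL import Image

def find_white_columns(img, min_width=25):
    """Returns a list of (start, end) x-ranges where every pixel column is
    white for the full height of the page — candidate column gutters."""
    arr = np.array(img.convert('L'))
    white_cols = (arr == 255).all(axis=0)    # True where a whole column is white
    runs, start = [], None
    for x, white in enumerate(white_cols):
        if white and start is None:
            start = x                        # a run of white columns begins
        elif not white and start is not None:
            if x - start >= min_width:       # keep runs at least one char wide
                runs.append((start, x))
            start = None
    if start is not None and len(white_cols) - start >= min_width:
        runs.append((start, len(white_cols)))
    return runs
```

A separator drawn down the middle of each returned range would then play the same role as draw_sep did above.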
Before we go to our last library, how well prepared do you think you are to take your Python skills out into the world?