I followed this transfer learning tutorial and had some decent results. In 1000 steps my model reaches an average loss of around 17, and the model can predict bounding boxes over most of the small objects in my image. I have no random stray boxes on other parts of the image, but on closer inspection I can see the model is only generally finding my objects (the bounding boxes are not well centered on the real objects). So I was thinking my next steps were to change my .cfg file input layers from 416 to 608 (or go even bigger?) and train for much longer, to see if the model better learns how to truly find each object and center the rect around it.

I strongly recommend you follow the other recommendations in this thread and try a quick transfer learning tutorial before you dive into building a custom model. Do not train your own model until you've tried a transfer learning model. Not sure which fork of Darknet you're using, but if you download this one you can calculate mean average precision very easily and create charts and all that; the readme is very useful as well. Otherwise, 608 should give you way better results, and if you can afford to run the input layers at 1024, definitely try it. Try messing with your burn_in too; a smaller value like 100-200 may get your model to learn faster. You want your loss value under 1, and I think 2000 iterations will get you there, so I would certainly train for more than 1000 iterations and go for something near 2000. One thing I remember the tutorial not mentioning, however, is adjusting the classes in the yolo layers and changing the number of filters in the convolutional layer before each yolo layer based on the number of classes: (classes + 5) x 3, so for 1 class you would use 18, etc.

Thanks for the answers! I'll probably try the extended training with larger input dimensions before I dig into customizing the model config and all the expected errors.
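The filters rule mentioned above can be double-checked with a quick calculation. A minimal sketch (the helper name `yolo_filters` is my own, not from Darknet):

```python
# The [convolutional] layer right before each [yolo] layer in the .cfg needs
# filters = (classes + 5) * 3: three anchor boxes per detection scale, each
# predicting 4 box coordinates + 1 objectness score + one score per class.

def yolo_filters(num_classes, anchors_per_scale=3):
    return (num_classes + 5) * anchors_per_scale

print(yolo_filters(1))   # 18, the single-class value quoted in the reply
print(yolo_filters(80))  # 255, the 80-class COCO default in yolov3.cfg
```

Remember that this value must be edited in the convolutional layer before each of the three yolo layers, not just the first one.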
Since the dataset is very specific and the images are similar, will I have to train the net from scratch, or can I use the Darknet convolutional weights from another model, as is done for other custom datasets and classes? Or will I have to create two models, the first to detect the bigger bounding box, then crop that region of interest and feed the cropped area to a second model to detect the smaller ones? Any ideas that may contribute are very welcome, and I thank you in advance if you took the time to read and/or answer these questions.

I was wondering if you have any experience training YOLOv3 on very small objects, objects that are probably 2-5% of the image width? I'm looking at a manufacturing product, searching for specific features that we've previously detected with intensity thresholding and edge detection with reasonable success. The objects we're looking for are nearly rectangular, so I defined ROIs that surround the rectangular shape and show some distinct intensity transitions on each edge. I've seen some mentions of adding extra yolo layers or changing the scale-space to examine the smaller objects. Is it possible to complete transfer learning if I modify the .cfg file with new or modified yolo layers? Will the weights still apply, or would this likely result in errors? I guess the safest bet is to start training a custom model, but I'd like to leverage past models if at all possible. I saw a page about transfer learning in PyTorch, and it appears that after loading weights into a model you can freeze the gradients of the other layers, so that only the YOLO layers have active gradients. Does this mean I could customize the convolution layers to change the analysis scale, load existing weights, and then activate gradients on those modified convolution layers to re-calculate their weights?
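On the two-model idea: the main bookkeeping is translating stage-2 detections, which are in crop coordinates, back into full-image coordinates. A minimal sketch of that step, with a hypothetical helper name (`map_to_full_image`) and made-up example numbers:

```python
# Two-stage detection bookkeeping: stage 1 finds the big bounding box, the
# crop of that region is fed to stage 2, and stage-2 boxes (relative to the
# crop) are shifted back into full-image coordinates by the crop's origin.

def map_to_full_image(box_in_crop, crop_origin):
    """box_in_crop: (x_min, y_min, x_max, y_max) in crop pixel coordinates.
    crop_origin: (x, y) of the crop's top-left corner in the full image."""
    x_min, y_min, x_max, y_max = box_in_crop
    ox, oy = crop_origin
    return (x_min + ox, y_min + oy, x_max + ox, y_max + oy)

# Say stage 1 found the big object at (200, 150)-(800, 650); we crop there.
crop_origin = (200, 150)
# Stage 2, run on the crop, found a small object at (50, 40)-(120, 90):
print(map_to_full_image((50, 40, 120, 90), crop_origin))
# (250, 190, 320, 240) in full-image coordinates
```

A side benefit of the crop is that the small objects occupy a much larger fraction of the stage-2 input, which is exactly the small-object problem raised above.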
Hi all, I'm new to this community and new to computer vision as a whole. I've been working with OpenCV for 1 month now on a project, and the results for our application seem good, because I managed to get the data I want from the pictures, but it is far from a production-level application. I am performing some object detection with image-processing techniques, but at this point it seems clear it will never be good enough for production. In our case, the application will receive pictures taken from smartphones, so there will be a lot of variable conditions such as lighting intensity, camera quality, lighting color, shadows, etc. The way I think I'll be able to achieve accurate object detection is through deep learning, and from what I've read, I think YOLOv3 may be suitable. The dataset is very specific and the pictures are similar; an annotated picture would look like this (I can't post a real picture for business reasons). Does YOLOv3 accept an annotation like that, where smaller bounding boxes are inside a bigger bounding box? If not, I'll need to train two models, which will be more expensive.
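For context on the annotation question: Darknet-style YOLO labels are one line per object, normalized to the image size, so a smaller box nested inside a bigger one is simply another line in the same label file. A minimal sketch of the conversion (the helper name and the image/box numbers are illustrative, not from the thread):

```python
# Sketch: converting pixel-space boxes to Darknet/YOLO annotation lines.
# YOLO expects, per object: "<class_id> <x_center> <y_center> <width> <height>",
# all normalized to [0, 1] by the image dimensions. Overlapping or nested
# boxes are just separate lines in the same .txt file.

def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    x_c = (x_min + x_max) / 2.0 / img_w
    y_c = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# An outer box and a smaller box nested inside it, in a 1000x800 image:
print(to_yolo_line(0, 100, 100, 900, 700, 1000, 800))
print(to_yolo_line(1, 400, 300, 600, 500, 1000, 800))
```

Whether the network detects both classes well when one box fully contains the other is a separate training question, but the annotation format itself has no problem with it.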