联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-23:00
  • 微信:codinghelp

您当前位置:首页 >> C/C++编程C/C++编程

日期:2024-04-16 09:21

AM Project 3: Real-time 2-D Object Recognition

1/6

Project 3: Real-time 2-D Object Recognition

Due Feb 24 by 11:59pm

Points 30

This project is about 2D object recognition. The goal is to have the computer identify a specified set of

objects placed on a white surface in a translation, scale, and rotation invariant manner from a camera

looking straight down. The computer should be able to recognize single objects placed in the image and

show the position and category of the object in an output image. If provided a video sequence, it should

be able to do this in real time.

Setup

For development, you can use this development image set

(https://northeastern.instructure.com/courses/175350/files/24760264?wrap=1)

(https://northeastern.instructure.com/courses/175350/files/24760264/download?download_frd=1) . Once

you have your system working, you'll need to create your own workspace for real-time recognition. If you

have a webcam, the easiest thing to do is set it up on some kind of stand facing down onto a flat surface

covered with white paper. If you have only a built-in laptop camera, find a section of white wall in front of

which you can hold the objects to be identified. Alternatively, you may be able to stream your phone's

camera video to your laptop. If you can get a setup where with a downward facing camera on a white

surface, that is the easiest thing to do. If none of those work for you, capture some videos or still images

with a phone and set up your program to process single images or video sequences.

Your system needs to be able to differentiate at least 5 objects of your choice. Pick objects that are

differentiable by their 2D shape and are mostly a uniform dark color, similar to the dev image set. The

goal of this project is a real-time system, but if that is not feasible given your setup, you can have your

system take in a directory of images or a pre-recorded video sequence and process them instead. Make

sure it is clear in your report which type of system you build.

If you are working on your own, you must write the code from scratch for at least one of the first four

tasks--thresholding, morphological filtering, connected components analysis, or moment calculations.

Otherwise, you may use OpenCV functions. If you are working in pairs, you must write the code for two

of the first four tasks from scratch.

Tasks

1. Threshold the input video

Using the video framework from the first project (if you wish), start building your OR system by

implementing a thresholding algorithm of some type that separates an object from the background.

AM Project 3: Real-time 2-D Object Recognition

2/6

Give your system the ability to display the thresholded video (remember, you can create multiple

output windows using OpenCV). Test it on the complete set of objects to be recognized to make sure

this step is working well. In general, the objects you use should be darker than the background, and

the whole area should be well-lit.

You may want to pre-process the image before thresholding. For example, blurring the image a little

can make the regions more uniform. You could also consider making strongly colored pixels (pixels

with a high saturation value) be darker, moving them further away from the white background (which

is unsaturated).

It may also be helpful to dynamically set your threshold by analyzing the pixel values. Running a k?means algorithm on a random sample of pixel values (e.g. using 1/16 of the pixels in the image) with

K=2 (also called the ISODATA algorithm) gives you the means of the two dominant colors in the

image. The value in between those means may be a good threshold to use.

Required images: Include 2-3 examples of the thresholded objects in your report

2. Clean up the binary image

Clean up your thresholded image using some type of morphological filtering. Look at your images

and decide whether they are displaying noise, holes, or other issues. Pick a strategy using

morphological filtering (for example, growing/shrinking) to try to solve the issue(s). Explain your

reasoning in your report.

Required images: Include 2-3 examples of cleaned up images in your report. Use the same images

as displayed for task 1.

3. Segment the image into regions

The next step is to run connected components analysis on the thresholded and cleaned image to get

regions. Give your system the ability to display the regions it finds. A good extension is to enable

recognition of multiple objects simultaneously (but this is not required). Your system should ignore

any regions that are too small, and it can limit the recognition to the largest N regions.

A segmentation algorithm will return a region map. If you try to display the region map directly, it won't

tell you much because the region IDs will all be similar. A better method of displaying the region map

is to ignore regions that are too small and to pick from a color palette for the remaining regions (e.g. a

list of 256 random colors). Once you have removed regions that are too small, it can be helpful to

renumber the remaining regions in sequential order. If you want to avoid color flickering in your region

map display, you will have to remember the centroid and color for each region in the prior image and

try to match centroids and colors in the next image.

OpenCV has a connected components function. If you don't write your own, use the one that also

returns the stats. If you chose to write your own, both the two-pass connected components algorithm

and the region growing methods are good algorithms to know.

AM Project 3: Real-time 2-D Object Recognition

3/6

A good strategy to use when getting started is to focus on the region that is larger than a certain size,

is most central to the image, and does not touch the image boundary. The region size, region

centroid, and region axis-oriented bounding box are all you need to make this comparison. Those

statistics are easy to compute from a region map, and they are provided by the built-in OpenCV

connectedComponentsWithStats function.

Required images: Include 2-3 examples of region maps, ideally with each region shown in a different

color. Use the same images as for the prior tasks.

4. Compute features for each major region

Write a function that computes a set of features for a specified region given a region map and a

region ID. Start by calculating the axis of least central moment and the oriented bounding box.

Display these overlaying the objects in your output.

Important: for this assignment, the purpose is to analyze regions, not boundaries. Use region-based

analysis, not boundary analysis, for generating features.

Percent filled and the bounding box height/width ratio are good features to implement first. Any

features you use should be translation, scale, and rotation invariant such as moments around the

central axis of rotation. Give your system the ability to display at least one feature in real time on the

video output. Then you can easily test whether the feature is translation, scale, and rotation invariant

by moving around the object and watching the feature value. Start with just 2-3 features and add

more later. There is an OpenCV function that will compute a set of moments. If you use it, explain

what moments it is calculating and which ones you are using as features in your report. Whatever

features you choose to use, explain what they are in your report and show the feature vector for the

images included in the report for this task.

Required images: Include 2-3 examples of regions showing the axis of least central moment and the

oriented bounding box. use the same images as for the prior tasks. Include the computed feature

vectors for each object.

5. Collect training data

Enable your system to collect feature vectors from objects, attach labels, and store them in an object

DB (e.g. a file). In other words, your system needs to have a training mode that enables you to collect

the feature vectors of known objects and store them for later use in classifying unknown objects. You

may want to implement this as a response to a key press: when the user types an N, for example,

have the system prompt the user for a name/label and then store the feature vector for the current

object along with its label into a file. This could also be done from labeled still images of the object,

such as those from a training set.

Explain in your report how your training systems works.

6. Classify new images

AM Project 3: Real-time 2-D Object Recognition

4/6

Enable your system to classify a new feature vector using the known objects database and a scaled

Euclidean distance metric [ (x_1 - x_2) / stdev_x ]. Feel free to experiment with other distance

metrics. Label the unknown object according to the closest matching feature vector in the object DB

(nearest-neighbor recognition). Have your system indicate the label of the object on the output video

stream. An extension is to detect when an unknown object (something not in the object DB) is in the

video stream or provided as a single image.

Required images: include a result image for each category of object in your report with the assigned

label clearly shown.

7. Evaluate the performance of your system

Evaluate the performance of your system on at least 3 different images of each object in different

positions and orientations. Build a 5x5 confusion matrix of the results showing true labels versus

classified labels. Include the confusion matrix in your report. Hint: you can add hooks into your code

to make this easier if you can tell the system if it is correct.

8. Capture a demo of your system working

Take a video of your system running and classifying objects. If your system works on still images,

show the visualizations your system makes. Include a link to the video in your report.

9. Implement a second classification method

Choose from the following options.

A. Implement a different classifier--something besides nearest neighbor--of your choice. For

example, implement K-Nearest Neighbor matching with K > 1. Note, KNN matching requires

multiple training examples for each object and should not use the voting method.

B. Use a pre-trained deep network (here)

(https://northeastern.instructure.com/courses/175350/files/26232326?wrap=1)

(https://northeastern.instructure.com/courses/175350/files/26232326/download?download_frd=1) to

create an embedding vector for the thresholded object (example code here

(https://northeastern.instructure.com/courses/175350/files/26232332?wrap=1)

(https://northeastern.instructure.com/courses/175350/files/26232332/download?download_frd=1) )

and use nearest-neighbor matching with either sum-squared difference or cosine distance as the

distance metric.

Explain your choice and how you implemented it. Also, compare the performance with your baseline

system and include the comparison in your report.

Extensions

Write a better GUI to show your system in action and to manage the items to your DB

AM Project 3: Real-time 2-D Object Recognition

5/6

Write more than one (or two if working in pairs) of the stages of the system from scratch.

Add more than the required five objects to the DB so your system can recognize more objects.

Enable your system to learn new objects automatically by first detecting whether an object is known

or unkown, and then collecting statistics for it if the object is unknown. Demonstrate how you can

quickly develop a DB using this feature.

Experiment with more classifiers and/or distance metrics for comparing feature vectors.

Implement both options for the last task.

Explore some of the object recognition tools in OpenCV.

Report

When you are done with your project, write a short report that demonstrates the functionality of each

task. You can write your report in the application of your choice, but you need to submit it as a pdf along

with your code. Your report should have the following structure. Please do not include code in your

report.

1. A short description of the overall project in your own words. (200 words or less)

2. Any required images along with a short description of the meaning of the image.

3. A description and example images of any extensions.

4. A short reflection of what you learned.

5. Acknowledgement of any materials or people you consulted for the assignment.

Submission

Submit your code and report to Gradescope (https://www.gradescope.com) . When you are ready to

submit, upload your code, report, and a readme file. The readme file should contain the following

information.

Your name and any other group members, if any.

Links/URLs to any videos you created and want to submit as part of your report.

What operating system and IDE you used to run and compile your code.

Instructions for running your executables.

Instructions for testing any extensions you completed.

Whether you are using any time travel days and how many.

For project 3, submit your .cpp and .h (.hpp) files, pdf report, and readme.txt (readme.md). Note, if you

find any errors or need to update your code, you can resubmit as many times as you wish up until the

deadline.

AM Project 3: Real-time 2-D Object Recognition

6/6

As noted in the syllabus, projects submitted by the deadline can receive full credit for the base project

and extensions. (max 30/30). Projects submitted up to a week after the deadline can receive full credit

for the base project, but not extensions (max 26/30). You also have eight time travel days you can use

during the semester to adjust any deadline, using up to three days on any one assignment (no fractional

days). If you want to use your time travel days, indicate that in your readme file. If you need to make use

of the "stuff happens" clause of the syllabus, contact the instructor as soon as possible to make

alternative arrangements.

Receiving grades and feedback

After your project has been graded, you can find your grade and feedback on Gradescope. Pay attention

to the feedback, because it will probably help you do better on your next assignment.


版权所有:留学生编程辅导网 2020 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp