Where are the pitfalls for adding a live stream zmq publisher service to imagenode? #3
Comments
Hi Mark,

Per your question #1: In the currently posted version of imagenode on GitHub, threading is used for two things: image capture and sensor capture. Threading is a good idea, and in my newer versions of imagenode I am experimenting with both threading and multiprocessing for the detectors, in addition to image capture and sensor capture. As I have done those experiments, I have learned that multiprocessing may be a better choice, since the RPi has 4 cores and, with Python threading, only one core is used for all the threads. But multiprocessing has its own drawbacks: Python objects cannot be shared directly between processes. So my design is evolving to use more threading and even more multiprocessing, especially in imagenode. I hope to push some of the multiprocessing stuff to the imagenode repository in the next month or so.

Per your question #2: live streaming using imageZMQ is possible, but larger image sizes benefit from compressing to jpgs. Multiple simultaneous streams can slow down depending on the number of senders, network load and the speed of the imagehub. I am using the ZMQ REQ/REP messaging pattern in all my applications, which requires the hub to send a REP acknowledgement for every frame received. That is a design choice; I want my RPi's to react to hub slowdowns. Other imageZMQ users have used the PUB/SUB messaging pattern and they have had some issues with slow subscribers growing larger & larger ZMQ queues -- see this imageZMQ issue: jeffbass/imagezmq#27. I am not streaming frames as video in my applications.

I would love it if you would fork one or more of my project repositories. Thanks again for sharing your great design. Post your GitHub repo links in this thread as you push your code updates, if you'd like to. I (and others reading these questions) can learn a lot from your work.

Jeff

(P.S. to PyCon 2020 viewers seeing this question: Please feel free to post additional comments & questions that follow on to this question by commenting on this issue thread. Please post a new or unrelated question by starting a new issue. Thanks!)
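For anyone following along, here is a minimal sketch of the jpg-compressed REQ/REP sending described above. The hostname, port, camera source and jpg quality are placeholders, not imagenode code:

```python
# Minimal jpg-compressed REQ/REP sender sketch (illustrative only).
import cv2
import imagezmq

sender = imagezmq.ImageSender(connect_to="tcp://imagehub-host:5555")  # placeholder host
cap = cv2.VideoCapture(0)  # placeholder camera source

while True:
    ok, image = cap.read()
    if not ok:
        break
    # compress to jpg before sending; helps with larger image sizes
    ret_code, jpg_buffer = cv2.imencode(
        ".jpg", image, [int(cv2.IMWRITE_JPEG_QUALITY), 95])
    # send_jpg blocks until the hub sends its REP acknowledgement
    reply = sender.send_jpg("node_name", jpg_buffer)
```
|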
Great talk, @jeffbass! Just one comment on what you said about multiprocessing:

This is largely true, but the multiprocessing module's shared memory support lets processes work on the same data without copying it between them. You may already be aware of this option, but I thought I'd pass it along just in case, as I don't see too much discussion of the shared memory features we get for free in the standard library.
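A minimal sketch of the idea, assuming Python 3.8+ (the array shape and names are just placeholders):

```python
# Share a NumPy image between processes via multiprocessing.shared_memory (Python 3.8+).
import numpy as np
from multiprocessing import Process, shared_memory

def worker(shm_name, shape, dtype):
    shm = shared_memory.SharedMemory(name=shm_name)
    image = np.ndarray(shape, dtype=dtype, buffer=shm.buf)  # no copy, same buffer
    print("mean pixel value:", image.mean())
    shm.close()

if __name__ == "__main__":
    frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for an OpenCV image
    shm = shared_memory.SharedMemory(create=True, size=frame.nbytes)
    shared = np.ndarray(frame.shape, dtype=frame.dtype, buffer=shm.buf)
    shared[:] = frame  # one copy in; readers and writers then work in place
    p = Process(target=worker, args=(shm.name, frame.shape, frame.dtype))
    p.start()
    p.join()
    shm.close()
    shm.unlink()
```
|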
Thanks, @mivade! |
Hi @shumwaymark, I recently implemented a multithreaded PUB/SUB subscriber as an example in imagezmq, which enables one-to-many broadcast with the receivers doing realtime (potentially heavy load) processing. Implementation here: https://github.com/philipp-schmidt/imagezmq. Open imagezmq pull request for further discussion: jeffbass/imagezmq#34. This was initially a response to the slow subscriber problem described here. Cheers
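The gist, in a very rough sketch (this is not the pull request code, just an illustration of a subscriber thread that always keeps only the latest frame so slow processing never backs up the ZMQ queue; address and class name are placeholders):

```python
# Rough sketch: a background thread receives jpg frames over PUB/SUB,
# keeping only the most recent one for the (possibly slow) main loop.
import threading
import cv2
import numpy as np
import imagezmq

class LatestFrameSubscriber:
    def __init__(self, address="tcp://publisher-host:5556"):  # placeholder address
        self._hub = imagezmq.ImageHub(open_port=address, REQ_REP=False)
        self._ready = threading.Event()
        self._msg, self._jpg = None, None
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            self._msg, self._jpg = self._hub.recv_jpg()
            self._ready.set()  # signal that a fresh frame is available

    def latest(self, timeout=5.0):
        if not self._ready.wait(timeout):
            raise TimeoutError("no frame received within timeout")
        self._ready.clear()
        return self._msg, cv2.imdecode(np.frombuffer(self._jpg, dtype=np.uint8), -1)

sub = LatestFrameSubscriber()
name, frame = sub.latest()  # heavy processing of `frame` happens here
```
|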
Thanks @jeffbass and @philipp-schmidt ! I think I'll do exactly that. Am using the production imageZMQ library, but have forked both imagenode and imagehub with the intent to build from there. Initially will just flesh out my framework with minimal changes to imagenode: adding the async logging over ZMQ, and experimenting with a new detector. Once that's standing on its own, will layer in the video capture and video streaming features. Will eventually get code posted, and will reply back on this thread with my findings. This is a nights and weekends project, but hope to have version 0.1 working in the next month or two. Looking forward to the challenge, and once again: a sincere thank you for the very strong leg-up! Mark |
Hi @shumwaymark , I merged @philipp-schmidt's pull request into imagezmq. He provided a great code example for implementing a multithreaded PUB/SUB subscriber. It works really well. You may want to take a look at that: PUB/SUB Multithreaded Fast Subscribers for Realtime Processing. Jeff |
Thanks @jeffbass, That's one of the very first things I did, and have been very pleased with the results. It seemed to slip right into the imagenode framework without much fuss. I thought it should be optional by camera, with a single publisher for the node, so added a couple of settings for this.

This code then follows the initialization of the sender in the ImageNode class:

```python
# if configured, bind to specified port as imageZMQ publisher
# this provides for optional continuous image publishing by camera
if settings.publish_cams:
    self.publisher = imagezmq.ImageSender("tcp://*:{}".format(settings.publish_cams),
                                          REQ_REP=False)
```

The following is at the bottom of the camera read loop:

```python
if camera.video:
    ret_code, jpg_buffer = cv2.imencode(
        ".jpg", image, [int(cv2.IMWRITE_JPEG_QUALITY), self.jpeg_quality])
    self.publisher.send_jpg(camera.text.split('|')[0], jpg_buffer)
```

...then again, can't help but wonder that with cabling and switches to support Gigabit Ethernet from end-to-end, maybe it would be faster to skip the compression?

This is working well and has been stable. Using the settings in the yaml file above, it will stream at about 45 frames/second. This is of course tightly paired with the throughput of the imagenode pipeline. That includes an image flip, and massaging the images prior to running a MOG foreground/background subtraction on each frame. Oh, and also sending tracked object centroids over the network via the async logger. Eliminating the image flip boosts throughput to over 62 frames/second. Even leaving the image flip in place and increasing the image size to 800x600, it still runs close to 32 frames/second. This increases the size of that small ROI from 32K to 57.4K pixels.

The above was actually the second thing I did. Some of my initial work was to add support for the async logging, a first draft of the camwatcher module, and modifying imagehub to introduce the imagenode log publisher to the camwatcher. That's a separate post. |
Thanks for sharing your work. 62 frames a second is amazing. If you have Gigabit Ethernet, then sending the frames uncompressed may be worth testing. |
Honestly, after playing with this for a while, keeping the frames as individual JPEG files seems the most practical. Easier on network bandwidth, and the disk storage requirements are reasonable. That's not news to you, I know. There's too much overhead in saving to a video format, nor does there seem to be much of anything to gain from doing so. Analysis needs access to individual frames anyway, and the overhead for the compress/decompress of individual frames doesn't seem too onerous. Additionally, any playback might need to be able to optionally select from/merge the results of multiple vision models, including any desired labeling, timestamps, etc. Or alternatively, just presenting the original images as captured, with no labeling.

Good news to report. I have the first draft of the cam watcher functionality fleshed out and working. Still a work in progress with an obvious lack of polish, but solid. I pushed everything I have so far up to my GitHub repository. Took your advice regarding threading to heart... the cam watcher has a lot of I/O requirements, so elected to implement it as a set of coroutines under a single asyncio event loop. This gets most of what's needed cooperating within a single thread. The video capture is based on the design suggested by @philipp-schmidt and forks as a sub-process.

One of the challenges to this design is correlating tracking data back to the captured image stream. An individual frame could contain multiple tracked objects. In the interest of keeping the image capture as lean (fast) as possible, it seemed too cumbersome to attempt to stuff all the tracking data into the message field of each published image. We may only be interested in parts, or none, of it anyway. An example use case might be an imagenode rigged with a USB accelerator which is focused on an entryway and running a facial recognition model directly on the device. Only the unrecognized face(s) require further processing. If every face in the frame is known, no further analysis may be needed.

For the image stream subscriber, there is some lag between the captured event start time and the filesystem timestamp on the first stored frame. Last I looked, the average latency was about 7 milliseconds. So based on the high frame rates I've been testing with, it seems safe to assume that at least the first 3-4 frames are being dropped before the subscriber comes up to speed, though it's likely a bit worse than that. The slight downside here is that for a motion-triggered event, the captured video begins after the event is already, and hopefully still, in progress. The publishing frame rate is tied directly to the length of the imagenode pipeline. My laboratory example is unrealistic, so would expect a velocity well under 32 frames/sec for most real world applications. I'm not running anything slower than a Raspberry Pi 4B, along with wired ethernet everywhere. For cameras permanently attached to the house, I intend to avoid Wi-Fi in favor of PoE over an isolated network segment, where feasible.

Since video playback framerate is likely much higher than the capture rate out of the image pipeline, it helps to estimate a sleep time between each frame to make the playback appear closer to real time. I'm currently dealing with this and all syncing options by estimating the elapsed time within the event to place each captured frame in perspective along with associated tracking data. Should be close to right, will know more after further testing.
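Roughly, the pacing logic looks like the following simplified sketch (hypothetical structure; it assumes each stored frame carries a capture timestamp):

```python
# Sketch: pace playback using per-frame capture timestamps so it
# approximates real time even when capture ran slower than display.
import time
import cv2

def playback(frames):
    """frames: iterable of (capture_timestamp_seconds, image) pairs."""
    start_wall = time.time()
    start_capture = None
    for captured_at, image in frames:
        if start_capture is None:
            start_capture = captured_at
        # how far into the event this frame was captured
        event_elapsed = captured_at - start_capture
        # sleep until the same amount of wall-clock time has passed
        delay = event_elapsed - (time.time() - start_wall)
        if delay > 0:
            time.sleep(delay)
        cv2.imshow("playback", image)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cv2.destroyAllWindows()
```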
I couldn't help but realize that much of the functionality of the cam watcher is already served by your existing imagenode/imagehub design. In some respects, I'm clearly re-inventing the wheel here. It is worth noting that the video publisher should play well with all of the existing functionality of your design; it just adds a final step to the existing pipeline. That's a huge plus, in my view.

I like the way this is coming together; I think it has a lot of potential for practical and fun applications. Live video monitoring for interactive display, and batch processing jobs, can tap into any desired video stream on-demand as well as replay/remodel prior stored events.

I have hopefully pulled your most recent updates to both imagenode and imagehub and merged in my changes. A diff should show the current state of affairs. The "tracker" detector is raw and purely experimental. Code added for the log and image publishing functionality should be independent of the tracker. I didn't think to put those pieces in a separate branch. That would have been smart. Your feedback on any and all of this welcome. Will take a pause to document what I have so far and then dive in to building out the next component: the inference and modeling engine. |
This looks like great work. I think your design is well thought out. I have not used asyncio event loops and it seems like a great tool set for a cam watcher. My first reaction is that your design is a complete rethinking and replacement for my imagenode/imagehub design. I look forward to your updates and your documentation in your GitHub repository. |
Done. Truth be told, I had approached your imagenode/imagehub projects with a specific design already in mind. All my thinking was centered around the concept of analyzing and storing video content not only for analysis, but also to support video playback and monitoring by a human reviewer. I've been reading your posts in the other thread regarding the Librarian design with great interest. Only found them in late December. You already understood what only became obvious to me recently: most vision tasks only require a few frames for analysis.

My goal is not to replace what you've built. I view my project as supplementary to yours, providing video publishing and optional real-time automated analysis of a motion event in progress. Can I teach my house to learn? While brainstorming my design, I had a lot of other computer vision ideas beyond facial recognition, many of which you've either already solved, or are actively pursuing yourself.

I'm going to remove my changes to imagehub. This was just a case of me over-thinking my design, where a camera node could be added dynamically and introduce itself to the larger system. That doesn't really make a lot of sense to me in retrospect. Any new node added to the network will need configuration anyway, obviously. Keep it simple, right?

I laugh now at my comment above about having something working in the next month or two. Not much has gone as planned for 2020. My real job has kept me busy. Thanks again Jeff. |
So, after getting that much of it working and documenting everything, it seemed like a perfect time to tear it all apart and rebuild it. This amounted to a complete re-factoring of the changes made to imagenode and removing all changes to imagehub. Have also moved to a data model based on CSV files for the data capture. Am now in a much better position to move forward with the rest of the project.

I wanted to reduce the blast radius of the changes made to imagenode, so have this boiled down to a single import statement and 3 lines of code. Everything needed is now incorporated into my detector. This is contained in a separate module, along with all related modules, in a sibling folder to the imagenode code. The hook for this can be found in the initialization code for the detectors:

```python
elif detector == 'outpost':
    self.outpost = Outpost(self, detectors[detector], nodename, viewname)
    self.detect_state = self.outpost.object_tracker
```

The initialization signature for the Outpost is shown in the hook above. It works. |
Was just re-reading one of your earlier replies in this thread, where you mentioned having also run some experiments with multiprocessing. Have been working on building out the "Sentinel" module for my project, which is a multi-processing design, so would be very interested in learning more about any successes and/or setbacks you've encountered along the way. Perhaps an early prototype I can review? Thanks! |
I have started a complete refactoring of my librarian. While I am using multiprocessing to start independent agents (such as a backup agent, an imagenode stall-watching agent, etc.), I have put the passing of images to an independent process in memory buffers on hold. I am waiting for Python 3.8 on the Raspberry Pi, so I can use multiprocessing shared_memory. I did a few quick tests on Ubuntu and they were promising. I expect the next version of RPi OS will be released soon and it will include Python 3.8 (replacing Python 3.7 in the current RPi OS version). Sorry, but I don't have any early prototypes for you to review. An imagenode & imagehub user @sbkirby has designed and built a completely different approach to building a librarian using a broad mix of tools in addition to Python including Node-Red, MQTT, MariaDB and OpenCV in Docker containers. He has posted it in this Github repository. I like his approach a lot, but I'm still trying to build a mostly Python approach. Jeff |
Hey Jeff,
I look forward to testing your new version of software.
Stephen
|
I pushed the new version to GitHub. |
Thanks Jeff. Looking forward to setting this up, still excited about ensuring my project fits and works well with everything you've been working on. I'm moving forward with a multiprocessing design for analyzing image frames in real time. The general idea is that an outpost node can employ a spyglass for closer analysis of motion events. A spyglass can employ one or more specialized lenses for different types of events. I'm probably overthinking things, as usual. The idea is to keep the pipeline running at full tilt so that publishing runs at the highest possible framerate, while processing a subset of images in a separate process for triggering events.
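A very rough sketch of that hand-off pattern (all names here are illustrative, not the actual Outpost/SpyGlass code): the capture loop never blocks, and the analysis process only receives a frame when it is idle.

```python
# Sketch: keep the capture/publish loop at full speed while a separate
# process analyzes only the frames it can keep up with.
import multiprocessing as mp
import queue
import time

def analyze(frame):
    time.sleep(0.1)                       # stand-in for heavy per-frame analysis
    return {"frame": frame, "objects": 0}

def lens_worker(frames, results):
    while True:
        frame = frames.get()              # blocks until a frame is handed off
        if frame is None:
            break
        results.put(analyze(frame))

if __name__ == "__main__":
    frames = mp.Queue(maxsize=1)          # at most one frame waiting for analysis
    results = mp.Queue()
    worker = mp.Process(target=lens_worker, args=(frames, results))
    worker.start()

    for frame in range(100):              # stand-in for the capture/publish loop
        # publishing would happen here, at the full pipeline framerate
        try:
            frames.put_nowait(frame)      # hand off only if the analyzer is idle
        except queue.Full:
            pass                          # analyzer busy; skip this frame
        try:
            while True:
                print("analysis result:", results.get_nowait())
        except queue.Empty:
            pass
        time.sleep(0.01)                  # stand-in for per-frame pipeline work

    frames.put(None)                      # shut the worker down
    worker.join()
```
|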
Not sure if an issue is the right place for discussion. However... |
The short answer is yes. Just not quite there yet. My focus has been on designing and building the framework to suit my use cases. You're correct though: this may not be the best venue for discussions of that nature, since they aren't directly related to Jeff's projects. Please feel free to post such questions over on my project. Thanks. |
@vkuehn , for my own projects I am using a Python-only approach. I plan to continue to do that. I am very impressed with @sbkirby's Node-RED approach (link above). I plan on writing a web interface using flask (which is pure Python). I also think @shumwaymark's sentinelcam design is a good one. For my own farm management stuff, the ability to send / receive text messages regarding water and temperature status was my first goal. I haven't looked into linking any home assistants. |
So @jeffbass: an update, lessons learned, and a couple of questions.

First, the update

As promised, built my idea for a framework to support a multi-processing vision pipeline using shared memory. And then just for fun, threw it out onto the imagenode and watched it run. Because, why not? Uses a REQ/REP socket pair from your imageZMQ library for IPC. This provides for an asynchronous pattern by employing a poll on the socket to check for readiness. I really like your imageZMQ library. Convenient and easy to use/extend. Thanks for exposing the underlying socket; very considerate. You are a gentleman and a scholar. Also using this for my datapump connections.

A lesson learned

Do not over-publish image data with ZMQ. Bad things can happen. I ran for months just focused on the camwatcher and fine-tuning data collection. All was well. The imagenode was happy. Then finally added that new multi-processing piece. This has made me take a hard look at imagenode performance. Mostly I've been ignoring that because I was happy with the results I was seeing. I had noticed that the equivalent of 2 cores were tied up in what I was previously running, but never investigated too closely. After adding the new code and taming the beast I had unleashed, it now idles at around 2.6 - 2.8 cores, with an Intel NCS2 which was lately added to the mix. With good results. Admittedly, there is obviously quite a lot of overhead in there. But I think worth it with regard to function, and as long as the results line up with goals, I'm OK with all that. I will eventually set aside some time to measure and chart performance against various deployment scenarios.

Added a throttle to dial back the publishing framerate somewhat. Not perfect, but it saved the day. Logically, this should try to reflect the reality of the source data, without the added cost of moving multiple copies of the exact same image over the network repetitively for no benefit.

A question, or two

My first foray into actually using imagehub as intended was to add a temperature sensor. The imagenode process doesn't always shut down cleanly when stopped; using systemd control commands afterwards to restart or stop/start the service resolves this. Assume it just walks the process tree and kills everything it finds.

The second question involves a sleep of a fixed 1.0 second. I changed this to sleep for self.patience instead of 1.0 - problem disappeared after that.

Impulse purchase: just ordered an OAK-1 camera. Looks like it is going to fit right into / add to this design. If this works as well as I think it will, it should be very cool. Mind-blowingly cool.

From where I'm sitting now, it looks like MQTT and Node-RED will be an important part of the final solution. The integration possibilities with WiFi switches, and other miscellaneous IoT gadgets, open up quite a world of possibility. When someone pushes the button for the antique doorbell, a switch could relay a command to display the front entry camera on a nearby monitor (perhaps you're upstairs in the bedroom). Likewise, vision analysis could be employed to detect whether a garage door is open or closed, and send the command to the connected relay to close it.

Am going to need your Librarian too. Seems like an ideal vehicle for storing knowledge and state.
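For reference, a minimal sketch of the poll-based readiness check mentioned above for the REQ/REP IPC pair (socket names and the ipc endpoint are illustrative):

```python
# Sketch: non-blocking REQ/REP over ipc, polling for the reply so the
# main loop never stalls waiting on the other process.
import zmq

ctx = zmq.Context.instance()
wire = ctx.socket(zmq.REQ)
wire.connect("ipc:///tmp/spyglass_wire")   # placeholder ipc endpoint

poller = zmq.Poller()
poller.register(wire, zmq.POLLIN)

def send_request(payload):
    wire.send(payload)                     # hand work to the other process

def reply_ready(timeout_ms=0):
    events = dict(poller.poll(timeout_ms))
    return bool(events.get(wire, 0) & zmq.POLLIN)

def get_reply():
    return wire.recv()                     # only call after reply_ready() is True
```
|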
Hi @shumwaymark. Thanks for your update, questions and comments.

Regarding your lesson learned: I have run into similar issues. I think a non-threaded camera read may be worth experimenting with, and my current framebuffer design would split image capture from image processing using shared memory.

Regarding your question (1): It is quite challenging to exit a Python program that uses threads and subprocesses.

Impulse purchases are a good thing! I also have an OAK-1 camera kit. I haven't gotten around to playing with it yet, but I want to incorporate it into my own system. I will be watching your progress. |
Thanks @jeffbass. That I got there in a roundabout way, incrementally. Also, thanks for the tips. I will experiment with the non-threaded camera read soon. Currently, the OAK-1 has me distracted.

First test of the OAK-1

Will be implementing support for the OAK-1 as a new camera type, producing 3 outputs:

- object detection on every frame
- the captured image
- the compressed image

The OAK-1 can feed the Raspberry Pi all of the above at 30 frames / second, with no heavy lifting on the Raspberry Pi, leaving it free for further analysis. Or whatever else is needed. Cool stuff. |
Thanks for the update. Keep me posted on your progress. Especially the OAK-1 learnings. |
Here's a quick update @jeffbass. I have this roughed-in, and working well enough so as not to be embarrassing. See the code in the project; there is no documentation beyond this and the Python code itself. Three lines added to the camera type selection:

```python
# --------8<------snip-------8<---------
            self.cam_type = 'PiCamera'
        elif camera[0].lower() == 'o':  # OAK camera
            self.cam = OAKcamera(self.viewname)
            self.cam_type = 'OAKcamera'
        else:  # this is a webcam (not a picam)
# --------8<------snip-------8<---------
```

Everything else is in the new module. It's really just a set of queues that need to be managed, and consumed. Currently you will only see hard-coded attributes specified as a starting point. Once the camera was un-boxed, and time invested in thoroughly scrutinizing the API, my mind exploded.

Consider that the pipeline will run constantly once started so, initially, those queues fill up quickly. By the time the Outpost might decide to begin consuming data, there is quite a lot to chew on. You'll see some logic that attempts to discard some of it just to keep things reasonable. Don't yet understand the best design pattern for this. Much depends on how the DepthAI Pipeline is configured. There are numerous avenues to follow.

Yes, I get that storing those OAK camera queues as a dictionary at the Outpost class level is not ideal, and a somewhat quirky idiosyncratic design pattern, but it works well with the imagenode architecture. Seems that this is clearly pushing the limits well beyond what looks like your original intent behind the camera/detector architecture.

Everything I said in the previous post is true. All of that at 30 FPS. With a blocking read on the image queue, you can count on a 30 FPS rate. Processing the data of course requires time. Can the imagenode keep up with just a read loop? It does. What I've learned though, is there is more to this than meets the eye. What you'll currently find in the implementation is only breaking the ground. Still don't have the final structure fully conceived.

Fair weather is upon me. My attention is soon to be consumed with outdoor projects. So wanted to get this posted to wrap-up the cabin fever season here on the eastern edge of the Great Plains. More later. Thanks again.
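Since so much of this comes down to managing the DepthAI output queues, here's a heavily simplified sketch of that pattern (the pipeline configuration is abbreviated, and the stream name, queue size, and non-blocking draining are illustrative, not the actual OAKcamera code):

```python
# Sketch: drain a DepthAI output queue without blocking, keeping only
# what the consumer is ready to use.
import depthai as dai

pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("video")
cam.video.link(xout.input)

with dai.Device(pipeline) as device:
    # small queue, non-blocking: old frames are dropped rather than piling up
    q_video = device.getOutputQueue(name="video", maxSize=4, blocking=False)
    while True:
        pkt = q_video.tryGet()        # returns None immediately if nothing is waiting
        if pkt is None:
            continue
        frame = pkt.getCvFrame()      # OpenCV-compatible numpy array
        # hand `frame` to the rest of the imagenode pipeline here
```
|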
Hey @jeffbass. Your thoughts on a design for a ring buffer to streamline image processing struck a chord with me. So worked up my version of that into the design of the sentinel. This just builds upon what is already proven with the signaling between the outpost and spyglass. Since a dedicated IO thread in the main process is primarily focused on JPEG retrieval over the network, image decompression, and ring buffer signaling, it runs fast. Much faster than the slower analytical tasks. So from a task perspective, there is not much risk of starvation, the rings just stay full all the time. Could probably even get by with a shorter length. It might be wise to move ring buffer signaling into a separate thread, and/or call for JPEG files over the network a little less often.

I ran tests with both a single task and two parallel tasks; image sizes here were either 640x480 or 640x360. For the Intel NCS2 tasks, when the total image count processed by the task dips under 200 or a little less, the overall frame rate per second begins reducing. There is some overhead. Sets with a length of less than 100 images seemed to run closer to 10/second. Needs more comprehensive testing, but very encouraged with the results I'm seeing.

I may open an issue for discussion on your imagezmq project. I'm concerned about potentially creating multiple 0MQ contexts within the same process. My datapump access is built on a subclass of your sender:

```python
class DataFeed(imagezmq.ImageSender)
```

I suspect your underlying class creates its own context for each instance. I'm wondering if adaptations to your imagezmq to support shadowing an already existing Context could be helpful, or if this topic had ever previously come up for discussion? When a DataFeed is created in a task engine, its context is reused for the other sockets:

```python
feed = DataFeed(taskpump)  # useful for task-specific datapump access
ringWire = feed.zmq_context.socket(zmq.REQ)  # IPC signaling for ring buffer control
publisher = feed.zmq_context.socket(zmq.PUB)  # job result publication
```

The main process for the sentinel uses a single context for all of the asynchronous sockets and ring buffer signaling:

```python
import zmq
from zmq.asyncio import Context as AsyncContext

ctxAsync = AsyncContext.instance()
ctxBlocking = zmq.Context.shadow(ctxAsync.underlying)
```

However, for my datapump connections, that's a different story. Was wondering if you had any thoughts on this? Hope all is well with you and yours. Sincerely, Mark
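P.S. For anyone following the ring buffer discussion, a bare-bones sketch of the shared-memory ring idea (frame shape, ring length, and class name are all illustrative; the real sentinel code handles synchronization and lifecycle details this sketch omits):

```python
# Sketch: a fixed ring of frame slots in shared memory; an IO thread/process
# writes decompressed frames in, analytical tasks attach and read in place.
import numpy as np
from multiprocessing import shared_memory

FRAME_SHAPE = (480, 640, 3)   # illustrative
RING_LENGTH = 8               # illustrative

class FrameRing:
    def __init__(self, name=None, create=False):
        nbytes = int(np.prod(FRAME_SHAPE)) * RING_LENGTH
        self.shm = shared_memory.SharedMemory(name=name, create=create, size=nbytes)
        self.slots = np.ndarray((RING_LENGTH,) + FRAME_SHAPE,
                                dtype=np.uint8, buffer=self.shm.buf)
        self.next = 0

    def put(self, frame):
        """Writer side: copy a frame into the next slot, return its index."""
        idx = self.next
        self.slots[idx][:] = frame
        self.next = (idx + 1) % RING_LENGTH
        return idx            # the index is what gets passed over the signaling socket

    def get(self, idx):
        """Reader side: zero-copy view of the frame in slot idx."""
        return self.slots[idx]

# writer process:  ring = FrameRing(create=True); idx = ring.put(frame)
# reader process:  ring = FrameRing(name=ring_name_from_writer); frame = ring.get(idx)
```
|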
Hi @shumwaymark, I have not tried using multiple 0MQ contexts in the same process. My own imagenode --> imagehub architecture is less ambitious than your design and hasn't needed it. I find your design intriguing, and will spend some time looking at your code. I have not used zmq.asyncio yet, so I don't have any thoughts on how it might affect your datapump connections. If you get something running using zmq.asyncio, let me know how it works for you. Does it speed things up or change the way the multiple cores are used / loaded? I'm interested in what you learn. I'd also like to hear about your experience using the OAK camera and its hardware image processing. Any learnings to share? My own OAK-D is still sitting in its box, but it is on my ToDo list! Thanks for sharing your project's progress. I'm learning a lot from it. I think your design descriptions / drawings are really well done. |
Thanks @jeffbass. That would be ideal, wouldn't it? And is probably what's missing from the sentinel design. If those sockets created under your imagezmq classes could share an already existing context, that seems worthwhile, but currently have no clue as to what's involved in making that actually work. Perhaps an item for my ToDo list. I need to measure the aggregate latency of the ring buffer signaling from the task engines to reveal how much of an issue that really is. Still having fun with my Covid project, 3 years later. I'll get there. This piece of it is the last major hurdle. Mark |
Hi all, sorry for bothering you, but I really need some suggestions from experienced developers. I have a webcam that can output both mjpeg and h264. I built an RTSP server on it to stream the h264 video to remote clients over WiFi for monitoring, and got almost negligible latency. Now I want to do some object detection (e.g. yolo) over the video stream on the remote client; what is the best practice to recommend? Should I directly use OpenCV to capture the RTSP frames on the client, or use zmq for compressed jpeg transmission between server and client? It seems that OpenCV's implementation of capturing and decoding an RTSP stream over WiFi has considerable latency, but compressed jpeg may be slower than h264. Furthermore, I also need to share image data between Python and C++ applications; the zmq ipc protocol works fairly well and is quite easy to write code for. However, shared memory might be better in terms of speed, and I am not sure how much effort it would take to get shared memory working between these two languages? |
@madjxatw , I hope others will chime in as I don't have much experience with video codecs. My own projects use RPi computers to send a small number of still image frames in OpenCV format. The images are sent in small, relatively infrequent batches. For example, my water meter cam (running imagenode) uses a motion detector class to watch continuously captured images from a pi camera and then sends a few images (over the network using imagezmq) only when the water meter needle starts or stops moving. There is no continuous webcam video stream like mjpeg or h264. Just small batches of still images.

If you are using a webcam streaming mjpeg or h264 then you will need to separate that stream into individual images on the receiving computer that is doing object detection. The load on the wifi network is between the webcam and the computer receiving the images. The network load will be continuous (because of the nature of webcam streaming of mjpeg or h264). The choice of mjpeg or h264 depends on many factors. There is a good comparison of them here. But most webcams cannot be modified to send small batches of individual images.

Processing your video stream on the receiving computer can be done in 2 different ways: 1) using OpenCV or other software to split the video stream into individual OpenCV images and then performing YOLO or other object detection on the individual images, or 2) performing object detection like YOLO5 directly on the video stream (there is a discussion of that here; I'm sure there are other ways as well).

I have not used the ZMQ ipc protocol. My guess is that shared memory would be better in terms of speed. I have been experimenting with Python's multiprocessing.shared_memory class (new in Python 3.8). It works very well for passing OpenCV images (which are just Numpy arrays) between Python processes. I have not used it with C++ but it is likely that some web searching will find code that does it. I also experimented with multiprocessing "shared c-types" Objects documented here. I have found the multiprocessing.shared_memory class easier to use. I don't know which one of these alternatives would be easier to use in sharing images between Python and C++. There is a good tutorial article on sharing Numpy Arrays (which is what OpenCV images are) between Python processes here. It might help you think about your choices.
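As a concrete starting point for option (1) above, a minimal OpenCV sketch for splitting an RTSP stream into individual frames (the URL is a placeholder):

```python
# Sketch: split an RTSP stream into individual OpenCV frames for detection.
import cv2

cap = cv2.VideoCapture("rtsp://camera-host:8554/stream")  # placeholder URL
if not cap.isOpened():
    raise RuntimeError("could not open RTSP stream")

while True:
    ok, frame = cap.read()      # one decoded frame (numpy array, BGR)
    if not ok:
        break                   # stream ended or dropped
    # run YOLO / other object detection on `frame` here
cap.release()
```
|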
@jeffbass, thanks a lot for your enlightening sharing, especially for those useful links! I've finally used GStreamer to implement my own RTSP stream grabber, and ZMQ pub/sub with the ipc protocol for inter-process communication between the Python and C++ applications. The overall speed is satisfactory, although the CPU usage is a bit high (around 12% ~ 15%). The ZMQ ipc protocol on Linux actually uses a UNIX domain socket internally. ZMQ is pretty easy and flexible to use in cases where speed is not extremely critical. Really appreciate the article about sharing numpy arrays between processes; I will read it soon and try out the shared memory solution. |
Glad to learn that you've found a workable solution to your problem. I've been thinking about support for h264 video for my own project. Though for my case, I'm working with the OAK cameras from Luxonis. These can not only stream the video, they can execute configurable ML workloads, such as inference, directly on the camera. The receiving system can then simultaneously collect both video and analysis results. The video format is attractive mainly due to its reduced storage requirements.

My target platform is an embedded system using low voltage single board computers, primarily the Raspberry Pi, and this tends to drive all the design decisions. Have been using shared memory for inter-process access to image data. This includes the use of ZMQ over IPC for signaling purposes. Allows for running parallel analysis tasks on the capture/receiver: collecting as much data as possible, while analyzing a subset of those frames at the same time. Currently, for post processing, individual frames as JPG image data are transported between systems over imageZMQ, then uncompressed directly into shared memory.

I suspect the slow speeds you're experiencing using ZMQ over IPC are due to the volume of data you're sending through the Unix OS socket. For my use case, the latency over this protocol is very low. The following results are summarized across about 20 tasks running MobileNetSSD inference. For ZMQ over IPC, the average latency per frame here is 0.000489 - I'm cheating of course, these are very small payloads using MessagePack for marshalling.

My approach to video analysis would be much the same. Would rely heavily on OpenCV for getting image frames out of the stream, then copy into shared memory. Or ideally, pre-allocating the NumPy array in shared memory and just using that for the OpenCV storage. I believe there should be lots of examples of using shared memory between Python and C/C++ since these are the same underlying libraries. Have never tried that though. Good luck! Your architecture sounds well thought-through.

Honest disclaimer: I do not always get performance figures that high. Those numbers are due to adding an Intel NCS2 stick for the neural net. Have seen varying performance stats, and the frame rates above are probably higher than average. I'm usually happy to see facial detection, manipulation, and embeddings collected at around 10-12 frames per second. |
Hey @jeffbass, Just looking back at that first post. April 2020. Over four and a half years. The genesis of my Covid Lockdown project, and haven't given up yet. The primary use case for just the facial recognition and learning is essentially complete. With the wall console more or less working now, the infrastructure to run this is finally all in place. All I have left is to work out the refinements and overall strategy for the machine learning life cycle. This has been a more challenging problem than I fully realized when I started. I knew it wouldn't be easy. Knew it would take a while.

And now that it's almost finally working, it looks like the OS I've been using (Raspbian Buster) is basically end-of-life. I think I'm inclined to just keep on keeping on for now. Not sure how practical that really is. Looks like you've been keeping your library current. How are you approaching the planning of all of your Yin Yang Ranch deployments? Do you upgrade? Not upgrade? Replace them when and if they finally break?

I'm also looking at the toys I've got. Have a few of the Intel NCS2 sticks, have those working. A couple of Google Coral accelerators that I've been planning to get to, but haven't yet. The Luxonis camera that I picked up is working with what is basically the same software as the NCS2. Also picked up an Arducam PiNSIGHT.

Truth is, I've spent a lot of time just crafting infrastructure. It's felt good that I was actually able to build something that met my expectations. Some of the coolest stuff I've ever written. It's a hobby project that has seen months of inactivity. Now that I finally have it built, it's time to get it fully working: designing the machine learning lifecycle for it, and building and teaching all the clever ideas I have in mind.

A couple of the aforementioned hardware toys are now kinda dated, and will become more difficult to support in the future. A conundrum? In some respects, it seems smart to keep using the software that's working. Sigh. It's even smarter though to keep everything current, to be able to easily take advantage of current technology. What are your thoughts? Some of these libraries are large, and tough to get built and running and working together on a Raspberry Pi. Mark |
Hi @shumwaymark! Great to hear that your projects are moving forward. I've been continuing to build and refine mine as well. The short answer to your questions is that, yes, I've been upgrading all my Yin Yang Ranch deployments. The old versions were working fine, but reaching end of life on all the packages I was dependent on. Many of my project improvements have been related to becoming current with newer versions of Python, Raspberry Pi OS and Picamera software. My Yin Yang Ranch system was working fine with the older versions, but there are lots of reasons to try to stay (mostly) current. Security and bug fixes in the new Raspberry Pi OS versions have been substantial. Picamera2 is worth the upgrade frictions. The RPiOS Bookworm defaults to a 64 bit OS rather than the older 32 bit OS. This is very helpful with memory buffers and with Picamera2 codecs and packages.

The biggest incentive for me was the significant improvements in the Picamera2 camera software versus the older Picamera software. My imagenode Raspberry Pi cameras are a key part of my overall system and were running fine on Raspbian Buster. But Picamera stopped being supported. Buster support was dropped as Raspberry Pi moved on to Bullseye and then to Bookworm. Picamera2, although still in Beta test, is now stable and maintained. Also, the newer Bookworm RPiOS has Python v 3.11 as its system Python. The improvements in all of these made it worth the time and effort to upgrade all my software. Picamera2 is based on the open source libcamera stack.

My hardware toys are also being upgraded. The newer RPi 4 and RPi 5 computers are faster and have lots more memory, which is easier to take advantage of using the new Python versions (e.g. some of the Numpy memory buffer Python modules). Newer Python versions and Picamera2 use the RPiOS 64 bit version. RPiOS 64 bit also enables newer & better versions of many software packages. The new PiCameras are great: Picamera HQ & Picamera Global Shutter cameras even allow different interchangeable lenses. And libcamera / Picamera2 can take advantage of more settings for exposure control, which have been very helpful for my critter cams. I have been using Google Coral accelerators, but haven't tried the Intel NCS2 sticks. I am also playing with a couple of Luxonis cameras, but I haven't got any of the stereo vision software working reliably.

So, for me, the upgrades to the latest software & OS versions have been worth it. Upgrading these packages has also helped me think about streamlining my designs to make future upgrades smoother and easier. I built my prototypes without thinking about that very much. Lesson learned. I have not made any progress on any of my machine learning stuff. I'm glad to hear that you are making some progress there. As always, I look forward to learning from you. Jeff |
Thanks @jeffbass, I knew that was the correct answer, I suppose. Working with and helping to manage a distributed data system, on a much larger scale, is my day job. So, duh. Laziness will only get me in trouble. What got me to this point was a pyimagesearch course purchase which came with a ready-to-run Raspberry Pi SD card fully loaded with software. That's been my base since the start. So have been more than lazy in that regard. It worked great though. I've known all along that the good times would one day be over, and I would be forced to roll my own eventually. No time like the present. If you need a tester for imagenode/imagehub, I'm available.

By the way, finally got around to fixing my ImageSender subclass (DataFeed) to support an async recv() with a timeout, close, and reconnect. I didn't realize there was an existing fork of your library which achieved a similar result. I just used that same polling operation I've been using for IPC for a non-blocking read, then added @philipp-schmidt's trick with the wait on an event and raised timeout from his PUB/SUB example. Glad he shared that. Great idea.

A little surprised to be back looking at this in late summer. This project has been mostly a wintertime pursuit. Building the watchtower was an unexpected summer sideshow that I've made some time for this year. Didn't realize that was the big missing piece, until I built it. Funny, huh. Lack of a plan, and still got there. Mark |
Thanks, @shumwaymark, for your offer to help test the next versions of imagenode & imagehub. Might be a while before I get to those, but I'll let you know when I do. |
Jeff,
For the TL;DR on this, just scroll to the bottom to get to my questions.
A brief introduction for context...
Have been brainstorming a personal side-project for the past few months, and feel like I'm ready to start putting it together. The motivation was just that this is something that seemed interesting and fun, and also possibly a cool learning vehicle for my grandson.
The goal is a small-scale distributed facial recognition and learning pipeline hosted on a network of Raspberry Pi computers. Something that could easily support presence detection within the context of smart home automation. Have bigger/crazier ideas too, but this was a good place to start.
Had learned about imagezmq from the PyImageSearch blog, and that led me here.
Just being completely honest here, my first reaction to your imagenode and imagehub repositories went something like... Awesome! I'm going to steal a bunch of this stuff.
Well done. And after looking at it for a while, I've come to recognize that what you've built is a much closer fit to my design than I had initially realized.
My initial goals here are to be able to recognize people and vehicles (and possibly pets) that are known to the house. Identifying package and mail delivery. Knowing when a strange car has pulled into the driveway.
Significantly, any new/unknown face should automatically be enrolled and subsequently recognized. We can always make a "formal introduction" later by labeling the new face at our leisure. Or deleting any that are not wanted.
I wanted central logging of errors and exceptions rather than keeping them on the SD card of the camera nodes. Using PyZMQ async logging for both that reason and to capture camera event data. A single detector could potentially generate a number of different result values in sequence: there can be multiple objects.
To support this design pattern, the camera startup goes through a sequence of steps, the last of which is: camera initialization completes and the processing loop begins.
This allows cameras to be added and removed dynamically. The cameras can push out a periodic heartbeat over the log as a health check. The cameras just need to know which data sink to connect to. The data sink then introduces the cam watcher.
Most inference runs as a batch job on a separate box(s). Some inference can be moved onto specific cameras that have USB accelerators where real time facial recognition is desired, such as the front door or foyer. All results are stored in a database.
Motion event playback can produce the original video, and support the inclusion of optional annotations/labeling, i.e. show the bounding boxes around each face along with a name.
Does any of this design interest you? I guess what I'm trying to ask in a round about way... Should I just fork your stuff and move on, or would you like any of this for yourself?
PyCon 2020 Questions
It looks like the imagenode camera detectors run single threaded. Was this a design decision, or is there more to that than meets the eye?
What are the pitfalls for adding a live-stream video imagezmq publishing service on the imagenode?
My thinking on that second question, is that it might be desirable to tap into the live camera feed on demand. This would support not only monitoring on a console or handheld, but would also allow a batch job to analyze a motion event while it is in progress.
Most cameras wouldn't have a subscriber, they would just routinely publish on the socket, it would be available for any application that might want it.
Thanks Jeff!
Mark.Shumway@swanriver.dev
https://blog.swanriver.dev