Juggling Data Set.
I have created a juggling data set. It consists of several videos of different juggling patterns, graphs, and .csv (spreadsheet) files. Some of the data is visualized in this YouTube video: Juggling Pattern Waveforms.
To create the dataset, I first took video of myself juggling. Next, using computer vision, I recorded the locations of the juggling balls in a spreadsheet. Once the data is captured in a spreadsheet, it can be easily analyzed. For example, this is the data for the pattern 423:
Collecting the data (the positions of the balls) from the video and recording it in a spreadsheet requires specialization in computer vision, particularly object recognition and tracking. Now that the data has been taken from the video and put in a spreadsheet, it is much easier to analyze!
The data can also be analyzed using Python (with the help of packages Pandas and Matplotlib). This short Python script shows how to load and graph the data.
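The script linked in the post is not reproduced in this thread. As a rough sketch of the same idea, the snippet below loads ball heights from CSV text and plots them against time. It uses the standard library's csv module rather than Pandas (which the post uses), so the loading step runs without extra packages. The column names (frame, ball1_y, ball2_y), the sample values, and the 30 fps default are assumptions for illustration, not the dataset's actual layout.

```python
import csv
import io

# Hypothetical layout: one row per video frame, one height column per ball.
SAMPLE_CSV = """frame,ball1_y,ball2_y
0,100,220
1,130,200
2,150,170
"""


def load_heights(csv_text):
    """Return {column_name: [heights...]} for every column except 'frame'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames if name != "frame"}
    for row in reader:
        for name in columns:
            columns[name].append(float(row[name]))
    return columns


def plot_heights(columns, fps=30):
    """Plot each ball's height against time (requires Matplotlib)."""
    # Imported here so that load_heights works even without Matplotlib.
    import matplotlib.pyplot as plt

    for name, ys in columns.items():
        times = [i / fps for i in range(len(ys))]
        plt.plot(times, ys, label=name)
    plt.xlabel("time (s)")
    plt.ylabel("height (pixels)")
    plt.legend()
    plt.show()


heights = load_heights(SAMPLE_CSV)
```

Calling plot_heights(heights) would then draw one curve per ball; for a real file, read its text with open(path).read() and pass that to load_heights.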
One of the applications of this dataset is to compare two different jugglers who are juggling the same pattern. But, for this, I need your help! I would like to include juggling from other jugglers in this data set. If you could supply me with a high-quality short video clip of juggling, I could track the locations of the balls and add it to the data set.
I don't quite see what result you are trying to achieve here, but I'd really like to help.
I'll try to record something tomorrow.
I have several goals for the juggling data set:
"If you build it, they will come."
I don't know what people will do with this data set! Hopefully, this data will have some really cool applications in the field of machine learning.
I gave it another thought. You can probably find as many juggling videos as you want on YouTube.
If you have any specific requirements for your software to work, then you'd better list them.
I considered using videos from YouTube (this is the data from an 11 ball flash).
There are a couple of problems with using videos from YouTube:
I guess it's also obvious that the props should be visible all the time, so that, for example, claws hiding a ball are unsuitable for tracking (or will the software just resume tracking where the ball reappears, leaving a hole in the rendered curve for the hidden part of the ball's trajectory?).
Yes, it is helpful if the props are visible. Sometimes the tracker will get lost if the ball is obscured, enters a different lighting condition, or changes direction rapidly. If the tracker gets lost, I can pause the program and reset the tracker to the appropriate position.
I can implement optical flow in Python, but I don't really understand it. Explaining how it works is way beyond me.
This video shows how optical flow works: https://youtu.be/-1ebo0YjQw8
Notice what happens when the ball passes in front of my face. The tracker is most likely to get lost there because the ball appears to be a blur on those frames.
What steps follow optical flow? I assume you're first segmenting the flow vectors to find the balls? Are you then using a predictive tracker that can predict the object's position when not detected, such as a Kalman filter? It must be possible for the tracker to know that the balls travel in parabolas when in the air, which would help tracking a lot.
The tracking program that I wrote is quite simple. It relies on the user to click on an object to be tracked, and then uses optical flow to track the object. There is no automatic detection; that is done by the user. Segmenting flow vectors is not something that I had considered, but I will try to implement that.
I have considered using a Kalman filter. OpenCV has a Kalman filter class, and I am currently researching it. A Kalman filter is probably the best tool for making juggling ball tracking software more robust.
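To illustrate the "balls travel in parabolas" idea raised above, here is a minimal sketch of the prediction step only. It is not a full Kalman filter and not the tracker described in this thread. It assumes heights are measured upward in metres, a known frame interval dt, and a known gravitational acceleration g; with pixel data both the scale and the axis direction would need converting. The function names are hypothetical.

```python
def predict_next_height(y_prev, y_curr, dt, g):
    """Predict the next height of a ball in free flight.

    For constant downward acceleration g, consecutive samples satisfy
    y_next - 2*y_curr + y_prev = -g * dt**2.
    """
    return 2.0 * y_curr - y_prev - g * dt * dt


def coast(y_prev, y_curr, dt, g, n_frames):
    """Predict n_frames heights ahead, e.g. while the tracker has lost the ball."""
    predicted = []
    for _ in range(n_frames):
        y_next = predict_next_height(y_prev, y_curr, dt, g)
        predicted.append(y_next)
        y_prev, y_curr = y_curr, y_next
    return predicted


# Synthetic free-flight path: thrown upward at 5 m/s, sampled at 30 fps.
dt = 1.0 / 30.0
g = 9.81
true_path = [5.0 * (k * dt) - 0.5 * g * (k * dt) ** 2 for k in range(6)]

# Use the first two samples to guess the next four.
guessed = coast(true_path[0], true_path[1], dt, g, 4)
```

A real filter would blend each prediction with the optical flow measurement instead of trusting either one alone; this sketch only shows where the prediction would come from.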
Optical flow is more than adequate for tracking the balls when they are in the air. Most of the tracking problems I encounter are when the ball has a sudden change in velocity as it is caught or thrown.
Tracking juggling balls in highly optimized video is a fairly easy computer vision task. A heuristic approach to solving my tracking problems is to collect video in higher resolution. I am currently working with 1080p video at 30fps.
Any particular patterns you'd like me to film? I'll try and do this next week.
Yes, can you film 15 catches of 5 ball cascade (from launch to collect)? I have noticed that good jugglers tend to throw the first ball (or maybe two or three balls) a little higher than the average pattern height. Jugglers that are just learning the 5 ball cascade tend to throw the first few balls a little low and then struggle into the pattern.
I want to get video from several jugglers, all juggling the same pattern for the same number of catches, so that I can compare things like average pattern height, dwell time, height of the first throw vs. the average height of the pattern, syncopation, etc.
Thank you for your interest in the juggling data set.
Sorry for the delay. I was going to film outside just now and then realized my balls are in the gym. Just letting you know I still plan on doing it :)
No rush! The juggling data set is a long term project for me.
A couple notes about uploading video:
Of course I read this post while it was already uploading to YouTube >.<, so I'll post it here anyway for those interested. It's also on its way to your email inbox. You can use this video in any way you like, including but not limited to using it in your data set.
Sorry for the terrible terrible quality! I hadn't tried 60fps mode before, at least now I know it is completely useless. If you prefer I can film it again in 30fps, which should be sharp 1080p.
I have a super wide lens and the camera allows me to zoom, probably some kind of digital zoom which made the sharpness even worse, but the lines of the building still don't seem exactly straight. I hope this doesn't skew the data too much.
Anyway, if you have wishes, let me know and I'll film it and send the raw stuff straight to you.
Thank you for sending me your video! Even though the resolution was not great, the contrast between the balls and the rest of the frame was amazing! The balls were very easy to track.
I have used my tracking software to extract the data; here are the results:
Now that I have data from another juggler, I can work on methods to compare your juggling to mine.
I'm watching this project with interest.
I'm wondering if professionals working on a synchronised routine would be interested in your software to analyse a video of their routine to easily identify areas where they are out of synch?
Most of the time it is easy for the professionals themselves to see when they are out of synch, no software needed. Perhaps for really precise things, such as isolations, this could be an application. But this software is not for that.
Excellent, both the juggling and the tracking. I particularly value seeing the start which inspires me to clean up my start to include the whole trough of the curves before the first releases.
I was surprised to find that the act of juggling started quite a while before the first ball was thrown. I'd like to collect data from multiple jugglers to see if everyone does two fake throws before launching the first ball.
I often do several (more than 2) fake throws when starting 5, but launch in to a lot of patterns with 3 straight away.
I tend to do the fake throws for 5 until I "feel" like I have the right rhythm and spacing.
Now for the stuff I'd be curious about:
In long runs, how stable are my throws? Like could you give my pattern a score based on how much the throws diverge from the average of all throws?
In what way do my L and R hand differ? Do they both throw equally high and wide, or how big is the difference?
In the shorter runs, it is completely fine for me if every ball behaves differently. Here I don't want to compare throw by throw, but rather run by run. I gave 3 samples of 15 catches, and I'd be curious to know if ball 1 of attempt 1 is similar to ball 1 of attempts 2 and 3, etc.
In the future it could be interesting to compare jugglers and their styles too. Average height, dwell time, width, etc.
For all of the scores above I'd of course like to bring the difference down to 0, and if there were a tangible number it would be fun to keep track over the years and try to become as robot-like as possible.
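One possible way to turn the stability and left/right questions above into numbers is sketched below. It assumes the peak height of each throw has already been extracted per hand; the choice of the coefficient of variation as the score, the function names, and the sample values are all assumptions for illustration, not a method anyone in this thread has settled on.

```python
from statistics import mean, pstdev


def stability_score(peak_heights):
    """Spread of throw peaks relative to their mean height.

    0.0 means every throw peaked at exactly the same height;
    larger values mean a less consistent pattern.
    """
    return pstdev(peak_heights) / mean(peak_heights)


def hand_difference(left_peaks, right_peaks):
    """Mean peak height of the left hand minus that of the right hand."""
    return mean(left_peaks) - mean(right_peaks)


# Made-up peak heights, in any consistent unit (pixels or centimetres).
left = [102.0, 98.0]
right = [90.0, 90.0]

score_left = stability_score(left)
score_right = stability_score(right)
gap = hand_difference(left, right)
```

The same two functions could be applied to throw widths instead of heights, or to whole runs when comparing attempt 1 with attempts 2 and 3.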
Anyway, those are some ideas of what to do with the data :). I hope you find some uses for this. And since seemingly I'm only the second juggler you've got on video: Guys, record some 5b 15 catches! Super easy stuff, right?
I have been working on determining dwell time and throw height. I do eventually want to create some metric that can describe how far a juggling pattern is from perfection. This is an interesting question though, because I don't really know what perfection is in terms of juggling.
I wouldn't steer towards the question of what a perfect throw would be, that can be answered by others over time. You could however determine how far a throw is from your own average, or how far it is from the throws of this one juggler who you think throws pretty perfect.
My thoughts about a perfect throw/pattern can be found here:
Since throws would be perfect parabolas (are they?), can't we visualize most of the data when we'd just have the starting point, the highest point and the catching point of each throw? Would diminishing the data like that make it easier to analyze, possibly easier to visualize? Is it hard to acquire these 3 points from the data?
In the absence of wind the center of mass free flight paths are perfect parabolas to pixel resolution, but the parabola may not be in the plane the camera is digitizing. There is also a little uncertainty about the end points. If you look at the velocity and acceleration plots you can see the noise. You can also see that, for the example shown, less than half is for free flight.
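The three-point idea can be sketched as follows: given any three (time, height) samples from one free flight, there is exactly one parabola through them, and its vertex gives the zenith. This is a minimal illustration under the assumption of noise-free samples taken while the ball is actually in the air; the function names and sample numbers are hypothetical.

```python
def fit_parabola(p0, p1, p2):
    """Return (a, b, c) of y = a*t**2 + b*t + c through three (t, y) points."""
    (t0, y0), (t1, y1), (t2, y2) = p0, p1, p2
    slope01 = (y1 - y0) / (t1 - t0)
    slope12 = (y2 - y1) / (t2 - t1)
    a = (slope12 - slope01) / (t2 - t0)  # second divided difference
    b = slope01 - a * (t0 + t1)
    c = y0 - a * t0 * t0 - b * t0
    return a, b, c


def apex(a, b, c):
    """Return (time, height) of the parabola's vertex."""
    t_top = -b / (2.0 * a)
    return t_top, c - b * b / (4.0 * a)


# Samples from y = -4.905*t**2 + 5*t + 1 (metres, seconds).
samples = [(0.0, 1.0), (0.5, 2.27375), (1.0, 1.095)]
a, b, c = fit_parabola(*samples)
t_top, y_top = apex(a, b, c)
```

With noisy tracking data a least-squares fit over all free-flight samples would likely be more robust than picking three points, and the fitted value of a gives a check on the pixel-to-metre scale, since 2*a should match the acceleration due to gravity.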
Tracking the ball during throws and catches is difficult for three reasons:
The easiest solution is to record video in a higher frame rate; 120fps is sufficient for most 5 ball juggling. Another solution is to use smoothing techniques on the data to reduce noise. I have applied smoothing techniques to Daniel's data, and these are the results (quite an improvement):
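The post does not say which smoothing technique was applied. As one simple possibility, a centred moving average is sketched below; the window size and the function name are assumptions for illustration.

```python
def moving_average(values, window=5):
    """Smooth a series with a centred moving average.

    Near the ends, where a full window is not available, the average is
    taken over the samples that do exist.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


noisy = [0.0, 10.0, 0.0, 10.0, 0.0]
smooth = moving_average(noisy, window=3)
```

A wider window removes more noise but also flattens the sharp velocity changes at throws and catches, so the window is a trade-off rather than a free choice.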
Right now I am focused on collecting good data, by working to make my tracking program more accurate.
I have identified five critical points that I would like to find: toss, crossing point traverse, zenith, catch, and toss onset. These critical points can be found in the data by using statistical techniques.
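The statistical techniques are not described in the post. As a much simpler stand-in, the sketch below finds only the zeniths (local maxima) and the low points (local minima) of one ball's height series; locating the toss, catch and crossing point would need more than this. The function name and sample data are hypothetical.

```python
def local_extrema(heights):
    """Return (zenith_indices, trough_indices) of a height series.

    A zenith is a sample higher than the one before it and at least as
    high as the one after it; a trough is the mirror image.
    """
    zeniths, troughs = [], []
    for i in range(1, len(heights) - 1):
        if heights[i - 1] < heights[i] >= heights[i + 1]:
            zeniths.append(i)
        elif heights[i - 1] > heights[i] <= heights[i + 1]:
            troughs.append(i)
    return zeniths, troughs


series = [0, 2, 4, 5, 4, 2, 0, 1, 3, 2]
zeniths, troughs = local_extrema(series)
```

On real tracking data this would probably need to run on a smoothed series, since noise produces many spurious small peaks.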
I think that your data request is not specific enough. If you want to compare the same pattern for different jugglers then you should specify exactly what data you want. I think you'd have a lot more response for that. I may be interested in filming something if you have a specific request. I have a nice plain wall outside and some bright balls.
For the purposes of comparing one juggler to another, it would be best to eliminate as many variables as possible:
It would be easiest to do this at a juggling convention, with one camera setup and an 'x' on the floor for the juggler to stand over.
Until then, if you (or anyone else) would like to have a short clip of juggling tracked and recorded in a spreadsheet, I am more than willing to help.
Come do this next year at the Dutch juggling championships. Plenty of ball jugglers who all believe they have the best 5b cascade; you can help them (dis)prove it and collect data at the same time!
I made sure to get explicit consent before including you in the Juggling Dataset to avoid future conflict if the analysis proved that your juggling is bad. Looking at my own data, and comparing it to yours I can see that you juggle much better.*
For jugglers that spend more than an hour a day with props in the air, juggling can become part of who they are. Finding out that they are not good at juggling (by some however arbitrary measure) can be a hard pill to swallow.
*I blame it on the balls.
Indeed, sadly, it's just an arbitrary measure... I'm not convinced that my stable juggling is any good until I get close to your 7b records ;)
Do I see correctly that this graph only displays the height of the balls over time? Variation in width of a throw and horizontal spacing would also be important to determine if your juggling is "clean".
Our graphs look very different. I assume the horizontal scale is based on the amount of frames? It wouldn't be too hard to adjust for actual time, no?
From the video with the overlay it seems like my two hands have opposite problems:
My right hand makes throws from a consistent horizontal location, but the throws vary in width. My left hand makes throws from different locations, but somehow manages to correct the width in such a way that the balls all land at exactly the same spot! I could have never noticed this without the visual aid. Can you check if this is also true in the longer runs? Can you think of a way of graphing out the data of the x axis in a way that this would be readable?
Of course, the same kind of graph but downwards, that could work...
(if only I could edit posts to include the thoughts I have seconds after posting, this thread wouldn't need to become a mess)
Also, the color coding is different in the graph and in the video. Sticking to one consistent order would make it easier to analyze.
Sorry for the overkill of input, I'm just excited and curious to study more of my juggling this way :D
Yes, these graphs plot height on the Y-axis and time on the X-axis. I would like to include the horizontal position of the balls, but that would mean graphing in three dimensions, which I have only a little experience with (I'll try it in Matplotlib for Python, but if anyone has suggestions...).
The graph of your pattern is more stretched out because yours is filmed in 60fps and mine is filmed in 30fps. I am working on a way to normalize the data so that different frame rates can be compared.
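One way such normalization might work is to resample every recording to a common frame rate by linear interpolation, so that sample k always means the same moment in time. The sketch below assumes integer frame rates and evenly spaced frames; the function name is hypothetical and this is not necessarily the approach the dataset ends up using.

```python
def resample(values, src_fps, dst_fps):
    """Resample a series from src_fps to dst_fps by linear interpolation."""
    if not values:
        return []
    # Number of output samples that fit inside the recorded duration.
    n_out = ((len(values) - 1) * dst_fps) // src_fps + 1
    out = []
    for k in range(n_out):
        pos = k * src_fps / dst_fps  # position measured in source frames
        i = int(pos)
        frac = pos - i
        if i + 1 < len(values):
            out.append(values[i] * (1.0 - frac) + values[i + 1] * frac)
        else:
            out.append(values[-1])
    return out


clip_60fps = [0, 1, 2, 3, 4]
as_30fps = resample(clip_60fps, 60, 30)
back_to_60fps = resample(as_30fps, 30, 60)
```

Simply plotting against time in seconds (frame index divided by fps) already fixes the stretched graph; resampling is only needed when two recordings must be compared sample by sample.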
I see that you have replied to your own comment several times! Glad to see you are excited about this. I am excited about this too, but as this is a long term project for me, I am going to take my time and try to produce high quality data and avoid publishing things only to have to retract them later.
Hello Stephen, I've been following this thread closely, it's a great project. I'll try to get some footage of myself juggling soon and send it to you.
I haven't got round to using Matplotlib yet because I've had access to Matlab while I am a student. I have done a fair amount of plotting, animating and other stuff. If you're willing to send me an example csv file of some juggling, maybe a column for each ball and each row being a discrete time step then I'd love to have a go at plotting some stuff. I can just send whatever I come up with to you, the Matlab code won't be directly useful because I suspect the syntax might be a bit different but it should be a nice little prototype.
The data for my pattern can be found here:
@Stephen: Until a 3d graphic is readable, 2 separate 2d graphics would also help to analyze the juggling.
Sorry Daniel, I wasn't paying attention to who-is-who there so my previous message was directed to you. Thanks for a link to the data, I've had a quick play around with it.
My working folder: https://drive.google.com/open?id=0B9vCeC0EU8QLWHNiUE03aVFmZDA
My first quick test plot: http://i.imgur.com/t6CSrXk.png
Don't worry, this is Stephen's data file anyway (of my pattern).
Looks interesting, I guess I'd have to view it in 3d to be able to see if it is analysable...
Nice graph! Did you smooth the data?
I made a .gif of the same graph using Python and Matplotlib, here's a link: Small: 3mb, Medium: 5mb, Large: 9mb.
You can find the code used to generate these (and lots more) in the Python Tutorials folder of the Juggling Dataset.
I didn't smooth the data but I did choose thick lines which can have a similar effect.
If your gif is animated then my phone isn't showing it properly, I'll check tomorrow on a desktop. Hopefully I'll make the switch to Python soon so I can use your stuff and my work won't be tied up in proprietary, expensive software... it's just so convenient though!
I'd like to do some animating of the data. It would also be interesting to separate each 'cycle' to see how consistent the throws are. This steps into statistics and is a bit outside my territory but it would be a fun project to learn more with.
I made a few more animated graphs and uploaded them to Imgur for viewing on mobile:
These graphs are also available in the graphs folder in the Juggling Data set.
I have learned a lot about statistics and computer vision working on this project. You can find the code that I used in the Python Tutorials folder (it isn't pretty).
It would be interesting to see how this relates to what Joost Dessing was doing at EJC2014 (a brief description was in Scott Seltzer's general review of EJC2014).
Thank you for that interesting link, I look forward to reading his publications. It seems that Dr. Joost Dessing and I have different aims for the data that we collect.
My goal for the Juggling Dataset is to create a dataset that others (who are so statistically inclined) can use to study juggling. I want the data to be as accessible as possible, similar to how the creators of Kaggle have made the Titanic Dataset accessible; by hosting the data online with a thorough description and several tutorials in multiple programming languages.
I assume Dr. Dessing collects data to prove or disprove a hypothesis and then publishes those results for academic prestige.
Early on in the creation of this data set I realized that creating a data set is difficult. There are several steps, each of which requires specialized knowledge, software, and equipment to do well: juggling, video recording of juggling, computer vision analysis of juggling video, formatting data and hosting it online, writing tutorials and descriptions for the data... all of this has to happen before a hypothesis can be tested.
Fields like machine learning and data science are really starting to take off! By creating the Juggling Data set, I hope to leverage the skills of machine learning practitioners and data scientists to advance the knowledge of juggling. They are nothing without data, but likely lack the ability to collect it. There are a lot of data addicts, but few data set creators.
I just recently joined the forum and I have a few questions about record logging.
Hope you can help me :)
So the thing I'm confused about is the order of throws in siteswap notation.
Let's say I want to record 56 throws of 12345 (we all know this trick, and there is no question that it is called 12345).
So I record 56 3b 12345.
My actual throws started with 3, like 3451234512... So why do I record 12345 instead of 34512 then?
12345 and 34512 are considered different tricks by the site engine, but they are actually the same one.
And how would I record this trick with an additional 6 throw? Is it 1234560 or 0123456?
Do we have any rules for resolving situations like this?
Also, do you count 2 and 0 as a catch?
And is there a way to tag people in posts?
The convention among jugglers is to typically write down or say a siteswap starting with the highest value. Sure, you might get into the pattern with different throws, or start the loop at a different place, but for clear and easy communication, it's nice if everyone sticks to the same order for the same pattern.
This means 97531 is always easy to recognise, rather than each time the reader having to decode that 31975 and 19753 and 75319 are all the same pattern.
In your case, you are wrong that there is no question that your pattern is called 12345, as you yourself then explain. The convention is to call it 51234.
As for counting catches, with running a siteswap pattern it's often easier to count the cycles of the pattern. For example, here is a video of the b97531 record. Catches aren't mentioned, but "151 rounds" is:
I've never heard of this convention. 51234 sounds strange. Why not start a siteswap at its easiest point of entry? If it is a ground state siteswap, it's always obvious. 45123 makes much more sense to me than 51234. Obviously it's going to be 97531 and not 19753
Besides, how do you solve for siteswaps with recurring high numbers? 777171 could be written in 4 correct ways then?
I'm not an expert on states, but the excited state pattern 891 doesn't start with the highest number either. I believe that the easiest entry is 778, whereas the easiest entry for 918 is 7788?
"and there is no question that it is called 12345" I think siteswap wise it should be called 45123, but 12345 is the obvious style choice. Which indeed makes counting tricky. You could link 45123 to the 12345 trick in the record section, claim that your version is the correct one and ask the current record holders for clarification of their counting method.
For myself I would also count cycles, not catches, but I understand that in the record section that doesn't work... I'm sure someone who uses the record section actively can comment on this?
The convention is actually to write the siteswap in numerical order. So 777171 would be the only correct way to write it. I assume that the reason is that it was convenient for early siteswap generators to write them in that format without having to work out the states. Writing them with highest values first is most likely to result in a low state start, unless you work out the states.
Well, Peter already answered this. It's the highest numerical value if converted into a single number. 777171 starts with 777, which is higher than 771, 717, 171 or 177.
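The "highest numerical value" rule described here can be sketched in a few lines: list every rotation of the siteswap and keep the largest. The sketch assumes vanilla siteswaps written as plain strings with one character per throw (digits, and lowercase letters for values above 9); the function names are hypothetical and this is not the records system's own code.

```python
def rotations(siteswap):
    """Return every rotation of a siteswap string, starting with itself."""
    return [siteswap[i:] + siteswap[:i] for i in range(len(siteswap))]


def canonical(siteswap):
    """Return the rotation that reads as the highest number.

    All rotations have the same length, so comparing the strings
    character by character matches comparing them as numbers.
    """
    return max(rotations(siteswap))


for written in ["12345", "34512", "171777", "31975", "891"]:
    print(written, "->", canonical(written))
```

Note that by this rule 891 comes out as 918, which is exactly the disagreement with other listings raised elsewhere in this thread.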
As for this: "I'm not an expert on states, but the excited state pattern 891 doesn't start with the highest number either. I believe that the easiest entry is 778, whereas the easiest entry for 918 is 7788?"
Let's write those down.
Into 891 is 778? So that's 778891891891...
Into 918 is 7788? So that's 778891891891...
Yeah, you've just come up with the same thing!
Of course they're both the same thing! But that still doesn't tell you whether you should write 891 or 918, right? And my generator & jugglewiki do call it 891, not 918...
I've never heard of that convention either. I've only ever heard 12345 called 12345. Searching for 12345 vs the other permutations on rec.juggling & the Edge (both the forum & the records section), 12345 is by far the most prevalent.
If two siteswap records are entered into the Edge records system that are a rotation of each other & provided you have built up enough 'experience' by entering records you will have the option to merge those two tricks together. Once merged you can enter the trick whichever way you like but they will be listed & compared together.
At present no-one has entered a permutation of 12345 to link to.
I have heard of that convention, and I would certainly write any 4-handed siteswaps that way (I believe that most of the passers do that).
When logging my juggling practice, however, I usually write ground-state siteswaps in the order I do them, so 51234 would be written as 45123, since that way I can say that I did 4 rounds and back to cascade or something like that. (While if I wrote it as 12345 or 51234, the same number of actual throws and catches would only contain 3 rounds and the first and last throws would count as transition throws...).
I don't log siteswap records so it doesn't matter in that case, but if I did, I'd feel that doing for example ...3333345123451234512333333.... would be 15 catches of 45123, but only 13 catches of 12345. (If I do active 2s, otherwise I would not count them.)
Someone might also have noticed that I sometimes log both 55050 and 50505 in the same practice session... Or 552 and 525. In that case, it's just different starts and has little to do with how to write siteswaps and more to do with me wanting to see in my log entry that I actually did two different starts, but being too lazy to use a lot of words. (55050 would be starting with one club in one hand and two in the other, and throwing from the hand with one club first. 50505 would be starting with 3 in the same hand.)
Same here: I've heard of that convention and use it for logging records, but in a given context I write them however makes more sense at the time.
I've heard of that convention and yet also commonly heard patterns referred to in ascending order, eg 12345. (The convention for writing multiplex throws also follows the "highest possible number" convention. eg , not . I learned that from Sean Gandini.)
In the single case of the pattern 12345, yes, that order is the most common by far. It feels like the natural way of saying it. However, it's in a class of patterns where throw values increase by one until it drops to the first number again, and in most other cases, the higher number is said first. Examples:
423 not 234
534 not 345
645 not 456
It's only because it feels natural to say the number 1 first that people do so! And it *is* so satisfying to say it that way! In conversation, I've no problem with saying 12345. I think it's more important to have clear communication between two people than to follow strict rules.
But in the case of making a list or database entry, I think it's best to stick with the convention. And if there's any confusion, explain the convention, not make exceptions for something that just happens to scratch a weird cognitive itch.
What if others (like me) find it easier to start it with the 3 ? i.e. 34512
There are many patterns that people don't start the same way, so the easy way for you isn't necessarily the easy way for others, hence the need for the convention as discussed by Peter and Luke to help everyone recognise a given siteswap from all its possible cycles.
Yes, this is the whole point. When ordering lists, the person reading it should know where to look, not think they are missing anything, and not worry that two things in different places are duplicates. This is why bookshops and libraries have settled on (within sections) ordering books by the author's second name, and then the first name, and then by book title/series title and number. If you went into a bookshop, and some books were ordered by the title, and some by the author name, and some by the colour of the spine, everyone would be super annoyed.
In the case of siteswaps in a list, or in a database like the records section, the obvious thing to do is order them by A: object number and then by B: numerical value.
This is important because, just looking at the siteswap out of context, it's impossible for most people to know the state of the pattern, or how they would transition into it from the cascade or fountain, or any number of other things.
And it's really important not to have 777171, 771717, 717177, 171777, 717771 and 177717 ALL listed in different places, or else the list would be unmanageable! You'd also have to have 441, 414 and 144 listed. And every other iteration of every other pattern, just in case another juggler liked starting on a different beat or had a different transition into the pattern.
If someone is confused, it's much better to explain the convention to that person (eg. "in the book shop we order by author surname") than it is to accommodate their preference at the expense of making the system more complicated and confusing for everyone else.
With regards to counting numbers. Don't count 0s, & only count 2s if they are active (thrown); if you are just holding the prop don't count it as a catch.
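The counting rule above can be sketched as follows: skip every 0, and count a 2 only when it is actively thrown. The sketch assumes vanilla siteswaps with one character per throw and whole cycles only; the function names and the active_twos flag are hypothetical.

```python
def catches_per_cycle(siteswap, active_twos=False):
    """Count the catches in one cycle of a siteswap.

    0s are never counted; 2s are counted only if they are thrown.
    """
    count = 0
    for throw in siteswap:
        value = int(throw, 36)  # handles digits and letters such as 'b'
        if value == 0:
            continue
        if value == 2 and not active_twos:
            continue
        count += 1
    return count


def total_catches(siteswap, cycles, active_twos=False):
    """Catches for a whole number of cycles of the pattern."""
    return cycles * catches_per_cycle(siteswap, active_twos)


held = total_catches("12345", 10)                       # 2s held
thrown = total_catches("12345", 10, active_twos=True)   # 2s thrown
```

This is why multiplying cycles by the number of digits overcounts for any pattern containing a 0 or a held 2.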
There currently isn't a way to tag people in a post, I don't think traffic is really high enough to warrant the feature? If you just type someone's name in all capitals I'm sure they'll get the message.
Thank you, Orinoco.
Not counting 0s and passive 2s makes sense. I just realized that I counted the number of cycles and multiplied it by the number of digits in the siteswap, but that turned out to be wrong.
Tagging people would be nice, as I see it. A highlighted name makes it clear to a random reader that the message has a direct recipient. It would also allow emails to be sent to tagged people if they want that, for example. I don't think everyone reads the forum from end to end. But it is minor.
I'm much more interested in resolving my siteswap issue.
You mentioned an option to merge two tricks together. I have not found info about it anywhere; could you please comment on it?
A way to merge tricks and make two different entries behave like one trick would really solve it.
But maybe it is also possible to preprocess siteswaps programmatically so that all versions of one trick are recorded with the same string (e.g. if you enter 315 it still shows up as 531).
I'm not sure if that is the right way.
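As a rough illustration of that preprocessing idea, the sketch below stores every logged entry under the rotation that reads as the highest number, so 315 and 531 land on the same key. The log entries are made up, the function names are hypothetical, and this is not how the records section is actually implemented.

```python
def canonical(siteswap):
    """Return the rotation of a siteswap string that reads as the highest number."""
    return max(siteswap[i:] + siteswap[:i] for i in range(len(siteswap)))


def merge_records(entries):
    """Keep the best catch count per trick, whatever rotation was typed in.

    entries is a list of (siteswap, catches) pairs.
    """
    best = {}
    for siteswap, catches in entries:
        key = canonical(siteswap)
        best[key] = max(best.get(key, 0), catches)
    return best


# Made-up log entries for illustration only.
log = [("315", 30), ("531", 42), ("12345", 56), ("34512", 40)]
merged = merge_records(log)
```

A system could still display each entry the way its author typed it and use the normalized string only for grouping and comparison.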
Go to any trick in the records section and look at the bottom of the page for "Is this trick the same as another? Link them together". However, based on what Orin said earlier it may not be available unless you've added a lot of records.
...& that link will appear if you've logged more than 10 records. This is just an arbitrary threshold so that linking is only handled by people who are at least a little familiar with the records system. I couldn't remember what the threshold was when I posted earlier so just had to look up the code!
Glad to join your community, hope to find and share some value on juggling edge.
Now you have one more juggler from Russia here along with Ilia Poliakov.
Hi & welcome!
find and share some value .. care to specify (or do you mean "in general") ?
[ Ilya Poliakov .. yeah, I saw his admirably huge repertoire and records, and sometimes he smalltalks :o] ]