Activity position in waiting to import list

I used CityStrides in the past, stopped for a while, and have been getting back into it over the last couple of months. Last night I became a paid supporter, as I figured it was about time I supported this project, and also to use the Node Hunter facility and get priority syncing.
I noticed the number of activities waiting appears to be climbing fast; it's risen approx. 10,000 in the last 6 hours or so.
My latest activity from 6 hours ago is logged as scheduled for import. However, I have no visibility of how long it will take to be imported.

It’s obvious you have a problem in terms of the sheer number of activities coming in vs. those you can pull from Strava. However, it would be great if there was some indication of where current activities are in the list of those to be imported.
I understand that for non-supporters it may appear that their activities are not progressing if paid supporters’ activities are continually jumping ahead. So maybe this is only displayed to supporters.
If there was an estimate of time linked to that position in the list, that would be great. But I’d even be happy to work it out for myself: if you can import 15,000 a day and I’m 10,000 in the list, it’s going to take approx. 16 hrs…
Less any of those used for new user imports etc.
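
As a rough illustration of that back-of-the-envelope estimate, here is a minimal Python sketch. It assumes a strict first-in-first-out list and a constant import rate, neither of which is guaranteed; the function name and parameters are hypothetical.

```python
def estimated_wait_hours(position: int, imports_per_day: int) -> float:
    """Hours until the activity at `position` is imported.

    Assumes a strict FIFO list and a constant daily import rate
    (both are simplifying assumptions, not confirmed behaviour).
    """
    if imports_per_day <= 0:
        raise ValueError("imports_per_day must be positive")
    return position / imports_per_day * 24


# Figures from the post: 10,000th in the list, 15,000 imports a day.
print(estimated_wait_hours(10_000, 15_000))
```

Any imports spent on other work (new user imports, for example) lower the effective `imports_per_day` and lengthen the estimate.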

Yeah, I need to improve/add to How Syncing Works

There is no “place in the queue” or a static order of things. I have X number of activities that all want to be synchronized, and any/all subscriber activities have priority before any non-subscriber activities are synchronized. When the Strava API limit is reached, any lingering activities to be synced are scheduled out after the next reset (whether it was the 15-minute or the 24-hour limit that was reached).

The fact that these jobs are attempted, then potentially rescheduled (limit reached), means that there’s no visibility into how long a job has been in this whole system - each attempt/retry/requeue/reschedule is a new instance.

One change that I intend to add is to have historic/full-account Strava syncing go into a lower priority queue so that new activities can make their way into CityStrides and the history-filling won’t block that. This means that new users have a degraded experience, but it’s basically my only option at this point.

It also seems there is a bit of a backlog on Strava syncs.

And, this may have been mentioned elsewhere already, and it has, I will find it…

But the Strava activity number(s) are no longer on the status page.

Thank you James!

Eric

Ahh, good to know. Am I right in thinking the following, then?
Say there are 2,000 supporter activities waiting to import.
A random 300ish will be imported in 15mins, and the rest will be put on hold, then 15mins later another random 300 will be imported from the 1,700 (plus any new activities) (not sure if these are actually 150 every 15mins at 2 API calls per activity multiplied out to 30,000 a day)

I assume the 200,000 activities are a mixture of supporters, non-supporters, regular activities and new user imports?

If so is it possible to see how many other activities are in the same group as regular supporters?
E.g. You have 1 activity to be imported along with 5,000 equivalent supporter activities out of the 200,000 total.

I can see this is a difficult problem to solve, I am just keen for as much information as possible to give an indication of how long it might take for my activity to be seen.

Do you have any plan to implement a system whereby the activities of supporters are imported in chronological order as opposed to the current system?

As the other commenter to this thread, I also noticed the actual activity numbers waiting are no longer listed on the page.

A random 300ish will be imported in 15mins, and the rest will be put on hold,

Yeah, basically, but it gets a little tricky/nuanced - there may be some semblance of order to things beyond totally random. There’s just not a strict order, because the job system is multi-threaded. If it can bring in ~50 activities at a time, then there will be some vague representation of order on the scale of every 50-100 activities … put another way, I wouldn’t expect the first enqueued activity to suddenly somehow swap order with the last enqueued activity.

not sure if these are actually 150 every 15mins at 2 API calls per activity multiplied out to 30,000 a day

New activities (those alerted by webhooks) ‘cost’ 2 API calls each. The historic activities ‘cost’ 1 API call per 100 activities for a user and then 1 API call for each activity. So a batch of 100 activities for a user would ‘cost’ 101 API calls.
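
The cost model above can be sketched in a few lines of Python. The function names are hypothetical, and rounding a partial batch up to a full listing call is an assumption; only the per-activity figures come from the post.

```python
import math


def webhook_cost(n_activities: int) -> int:
    """API calls for new activities announced by webhooks: 2 each."""
    return 2 * n_activities


def historic_cost(n_activities: int) -> int:
    """API calls for one user's historic activities.

    1 listing call per 100 activities (a partial batch is assumed to
    need a full call), plus 1 call per activity.
    """
    listing_calls = math.ceil(n_activities / 100)
    return listing_calls + n_activities


print(webhook_cost(1))     # one new activity
print(historic_cost(100))  # one full batch of historic activities
```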

I assume the 200,000 activities are a mixture of supporters, non-supporters, regular activities and new user imports?

Yeah, that’s a count of everyone.

If so is it possible to see how many other activities are in the same group as regular supporters?

These counts are expensive and I’m noticing performance issues as the queue sizes increase. That’s not to say it can’t be done - it’s just not heading down that path as things are written right now.

Do you have any plan to implement a system whereby the activities of supporters are imported in chronological order as opposed to the current system?

No. The current system is wildly efficient - I push through 20-40 million jobs per day. I’m not expecting any different system to outperform this. The problem is not with my background job system; it lies solely in Strava having a very low API limit for my 12,379 current Strava-connected users, with a growth rate of 55-227 new Strava connections per day over the last couple of weeks.
Further, this system is “chronological enough” for me. The only place that this breaks down is when API limits are grossly exceeded (the past two weeks).

the actual activity numbers waiting are no longer listed on the page.

I can no longer display this information.
I’ve revamped the historic syncing from a single job that would attempt to sync 100 activities for a user in one go (which basically limited that syncing to 3 people each 15min period) … to one where the metadata for all of a user’s activities is collected from Strava & enqueued in my job system. This increases the activity queue load to hundreds of thousands of activities, but makes it much more likely that all people will slowly be synchronized together (as opposed to one person at a time).
I’ve also de-prioritized this historical syncing: subscriber activities get first priority, then any new activity is prioritized, then the historical sync is worked on if there are API calls left.
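
To make that priority order concrete, here is a minimal single-threaded Python sketch. It is an illustration under assumptions, not the actual CityStrides job system (which is multi-threaded and has no strict order): the tier constants, the `drain` function, and the job tuples are all hypothetical.

```python
import heapq
from itertools import count

# Lower number = higher priority (order taken from the post above).
SUBSCRIBER, NEW_ACTIVITY, HISTORICAL = 0, 1, 2


def drain(jobs, api_budget):
    """Sync jobs in priority order until the API budget runs out.

    jobs: iterable of (tier, name, api_cost) tuples.
    Returns (synced, deferred); deferred jobs wait for the next reset.
    """
    seq = count()  # tie-breaker that keeps enqueue order within a tier
    heap = [(tier, next(seq), name, cost) for tier, name, cost in jobs]
    heapq.heapify(heap)

    synced = []
    # Stop as soon as the next job no longer fits in the remaining budget.
    while heap and heap[0][3] <= api_budget:
        _, _, name, cost = heapq.heappop(heap)
        api_budget -= cost
        synced.append(name)

    deferred = [name for _, _, name, _ in sorted(heap)]
    return synced, deferred


jobs = [
    (HISTORICAL, "old run 1", 1),
    (NEW_ACTIVITY, "non-subscriber run", 2),
    (SUBSCRIBER, "subscriber run", 2),
    (HISTORICAL, "old run 2", 1),
]
synced, deferred = drain(jobs, api_budget=5)
print(synced)
print(deferred)
```

With a budget of 5 calls, the subscriber and new activities are synced first and historical work only uses whatever is left over.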


Strava has been completely unresponsive to my requests for an API rate limit increase. I’ve explained to them that I’m currently very close to the ceiling (12,379 * 2 API calls per activity = 24,758 API calls per day, leaving 5,242 API calls for historical syncing) and that I’m roughly 60 days away from reaching that and being fully saturated.
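
The budget arithmetic above can be reproduced with a short Python sketch. The 30,000 daily limit is inferred from the figures in the post (24,758 + 5,242), and one new activity per connected user per day is the assumption the post's calculation implies; the names are hypothetical.

```python
DAILY_API_LIMIT = 30_000     # inferred from 24,758 + 5,242 in the post
CALLS_PER_NEW_ACTIVITY = 2   # webhook-announced activities cost 2 calls


def daily_headroom(connected_users, activities_per_user_per_day=1):
    """Return (baseline_calls, calls_left_for_historical_sync)."""
    baseline = (connected_users
                * activities_per_user_per_day
                * CALLS_PER_NEW_ACTIVITY)
    return baseline, DAILY_API_LIMIT - baseline


baseline, headroom = daily_headroom(12_379)
print(baseline, headroom)
```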

Perhaps it’s time for those of you affected by this to submit support requests: https://support.strava.com/ (“Submit a request” in the top right), or publicly mention this to them on Twitter: https://twitter.com/stravasupport (Start your message with something like “Hi” and then include the at-mention, so that it’s a public tweet and not a reply that’s only shown to them). :man_shrugging: I don’t know what else to do at this point.

Thanks for taking the time to explain all of this James.

I see why your hands really are tied with all of this.

I will raise a ticket with strava to try and help.

I think I must have been lucky the last week or so, as my non-premium-member activities have been syncing within 18 to 36 hours. But looking at the totals as they stand, 210,000 is 7 full days even with no more activities arriving, and new activities/new year syncs are coming in faster than they can be imported, as that total is rising.

I was wondering if it was worth trying a service such as tapiriik, moving new activities to MapMyRun and importing them to CityStrides that way. But they would still be in Strava, which would mean you would need to import both, which doesn’t help you, as I assume the system won’t disregard a Strava activity if a MapMyRun activity exists with the same name, time and date?
I wouldn’t want to disconnect my Strava connection as I don’t want my historical data being erased…
Thanks again,
I will look at strava now.

I know I’ve asked this before but I don’t think I actually saw an answer. Why not go directly to Garmin and get the data from them?

I’m just some random average idiot on the internet, but I posted over 3 weeks ago about this growing problem basically asking what your plan was to fix it. In those 3 weeks the problem has grown exponentially (not the only thing to do so) and I still honestly don’t see a long term plan here. Right now you have paying supporters that are not getting adequate upload times and forget about it if you’re a non-subscriber. I feel like we were at a shit or get off the pot moment a couple weeks ago. Not to poke at a sensitive subject, but this website must be generating decent cash flow and it really needs some serious attention. At this point it’s possible that you missed capturing all the new covid mappers but some quick decisive changes could still keep them around. Strava is probably a dead end because they have no interest in going out of their way to help you but all of your focus has been on them. Garmin costs money and has an upfront labor cost to you but clearly seems like the long term strategy. Obviously switching over to Garmin will be much tougher now that there are 2000 more members than there were 3 weeks ago, but something needs to change here quickly if you want the site to be successful.

The last thing I saw on this is that Garmin charges a one-time fee of $5,000 for access to their API…

However, I read somewhere else that Garmin dropped the fee. This page may very well bear that out: https://developer.garmin.com/health-api/overview/ (details of what’s covered in the API are at that link too)

GET THE API

Access to our Health API is free to approved developers.

I guess the trick is becoming an “approved” developer. :wink:

But the one thing I am most curious about is, would switching to MapMyFitness or Runkeeper help? I find CityStrides more enjoyable than Strava, though I would miss the metrics. But maybe they are there in the Garmin app? I was using the Strava app before I had my Garmin, so kept rolling (and running) with Strava. If one is “better” than the other, which one? I have no experience with either.

[edit] It was hjkiddk who originally posted that the API is free, in the “poll thread”. It was bugging me that I could not remember where I saw that.

You gotta spend money to make money. Spending the 5k a month ago when trouble was on the horizon would probably have paid off by now and maybe there would be more than the 2000 new users that have already joined in that time. It’s hard to keep paying subscribers when the site isn’t even working.

Totally agree… it is a now world and to have to wait a day for data to be loaded is something the market won’t tolerate. The moment someone else comes along and figures out a way to have their data get imported immediately, people will flock there. For anyone who is running everysinglestreet, it’s difficult to go out the next day and know what route you want to run when you can’t look at your own up-to-date map.

I agree. I just joined a week ago and was really excited to find a website that would create this type of map. I primarily use Garmin, and signed up for Strava simply to be able to sync with City Strides. Already frustrated though with the delay in syncing. I wish I could simply upload the activity manually myself, it would be way faster than waiting for this damn Strava sync.

I’m sure many others are frustrated too and second-guessing whether or not City Strides is worth using with these massive delays.

I’ve been mostly ok with waiting longer than usual lately as I know there are so many new members and activities. With that being said, it used to be that, worst case scenario, I’d go to bed and wake up with my activity having counted and I could go after new streets that day just fine. Now it seems that processing times can be over 24 hours, which can be frustrating as I’ll have done another run and the previous run still isn’t there. Also, I like to look at my map as it slowly gets more filled in. If 3 or more activities get filled in a day or more later it takes some of the fun away as you don’t get to see your progress as you go.

I agree with this idea - I understand why it doesn’t work in the current framework of Strava priority syncing - but I’m pretty frustrated. I was excited to find this site. I poked around, connected to Strava and then waited and waited. Then I found the forum, thought becoming a supporter would help so I ponied up and contributed some cash - still nothing. I tried to sign up for Runkeeper, but their signup process is broken.

I don’t have any idea when this site will be useful for me, but if it’s more than a few days/week, I’ll either delete my account or (more likely) forget about it for a long time, only to remember it as the service that I couldn’t use. I don’t mean to be harsh, I hope it didn’t come across that way.

I made a Runkeeper account a few days ago, just to be ready if Strava is not going to help us.
They support bulk upload of GPX files, so it only took 10 uploads to make a full sync.

Have you tried signing up again?

Just tried with a different email and it worked… strange. Thanks for the encouragement!

There’s also https://tapiriik.com/ for anyone looking to bulk-move data around.