Comments: "Infrastructure, infrastructure, infrastructure!"


Postby AlanOfTheBerg » Fri Nov 04, 2011 3:00 pm

Please use this thread for comments and discussion of Ehud's "Infrastructure, infrastructure, infrastructure!" announcement.

eshabtai wrote: We've recently released our new 3.0 client for the iPhone (Android is coming soon!) and got really incredible feedback on it. However, some of you shared your feelings with us, telling us that our priorities are wrong. Instead of a redesigned client or new features, you think we'd better focus on our infrastructure first, as building new stuff on top of a shaky base does not make much sense.

While infrastructure is a general term and includes different services, we are fully aware that a lot of the things we promised from day one are not working as they should. This is mostly true of our "daily process", which includes points, client tile updates and map editing. In fact, from your perspective they have not improved over the last year or so, despite us telling you that "we are working on it".

The fact is that the majority of our dev team is actually working on our infrastructure and we have not stopped that work in order to add new features. We have been constantly redesigning and deploying new improvements to our backend servers. So why aren't you seeing any real-world improvements? There are several reasons for that:

  • We are growing fast. While a lot of things have not improved, they also have not deteriorated, which means we did improve our capacity, just not fast enough. What we saw is that every time we deployed a newly redesigned backend service, within just a few days new users would come and fill the available capacity.
  • Some improvements are transparent. Twice in the last year we had a big downtime of several hours. We rely on AWS to run our services, and when they had issues our service went down. So over the last few months we have been working (and still are) on redeploying all of our core services in a redundant configuration where each service runs in at least two AWS availability zones. That way, if one zone goes down our service will keep running in another zone.
  • Serialized processes and dependencies. Historically, our "daily process" was a serialized process which involved analyzing users' tracks, updating points (partially based on the track analysis), updating tiles, switching over to the new tiles, etc. Since every such task depends on another one, every issue in every task causes a delay in the whole process. So while we have been improving and optimizing the different parts of the chain, there was always a weak link causing delays.
  • Deployment and testing. Deploying infrastructure changes is a slow process as the effect of a bad deployment is pretty much disastrous.
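To make the dependency point in the list above concrete, here is a minimal sketch. It is not Waze's actual pipeline: the stage names follow the tasks named above, while the durations, the `Stage` type, and the `finish_times` helper are invented for illustration. It compares a fully serialized chain with a variant in which points no longer wait on track analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    hours: float            # hypothetical duration, not a real measurement
    depends_on: tuple = ()  # names of stages that must finish first


def finish_times(stages):
    """Return {stage name: hour at which it finishes}, honoring dependencies."""
    by_name = {s.name: s for s in stages}
    done = {}

    def finish(name):
        if name not in done:
            stage = by_name[name]
            # A stage can only start once all of its dependencies are done.
            start = max((finish(d) for d in stage.depends_on), default=0.0)
            done[name] = start + stage.hours
        return done[name]

    for s in stages:
        finish(s.name)
    return done


# Fully serialized chain: each task waits for the previous one.
serial = [
    Stage("analyze_tracks", 6.0),
    Stage("update_points", 2.0, ("analyze_tracks",)),
    Stage("update_tiles", 4.0, ("update_points",)),
    Stage("switch_tiles", 1.0, ("update_tiles",)),
]

# Same work, but points are computed independently of the other tasks.
decoupled = [
    Stage("analyze_tracks", 6.0),
    Stage("update_points", 2.0),
    Stage("update_tiles", 4.0, ("analyze_tracks",)),
    Stage("switch_tiles", 1.0, ("update_tiles",)),
]

serial_done = finish_times(serial)
decoupled_done = finish_times(decoupled)
print(serial_done["update_points"], decoupled_done["update_points"])  # prints: 8.0 2.0
```

In the serialized version, any extra hour in track analysis pushes back every later stage; in the decoupled version, the points update is unaffected by it.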
So, while we have been working on improving our infrastructure, at the end of the day it was not fast enough to address your demand, and we have actually been limiting your willingness to contribute time and help us improve the service. We have also fallen short in communicating the issues to you, the things we are actually doing to solve them, and how much time it is going to take. Once again I'm now going to tell you that things are going to change soon for the better. Instead of just promises, I'll try to describe where we stand and when it is coming.

  • Cartouche. We have recently relaunched the new Cartouche with a new UI, which was a big improvement over the old one. But this was not just a UI project. We have also rewritten the backend on which the old Cartouche was running. The new backend allows us to simply add more and more servers as the demand grows, and I'm sure you noticed that it runs much faster. We are still limited by the number of saves we can process per second, but for now we have a lot of room to grow there.
    We launched the new backend sooner than we planned, as the old backend just couldn't handle the load and we had no way to scale it. Unfortunately we still have some issues we are looking into, and while we solved a lot of them, our cache currently has some bad data which we need to rebuild. We plan to start working on that in a few days, and hopefully this will solve a lot of the inconsistencies you encounter.
    Until we rebuild everything, in a few days we will launch a new update which will bring back the detailed error messages. These messages will help you figure out what's wrong with your edits and in many cases fix the issue and resave.
  • Points. We have been working on breaking most of the dependencies which were required for our points calculation. The calculations rely on a huge number of events which our different services produce. As part of scaling out our backend, we are now using Hadoop to process them. That way we can easily add more instances as the amount of data grows. Our first goal is to be able to predictably update the points every day without relying on other processes. Most of the development is done and we plan to deploy it in a week or two. Our next step, which we have already started to work on, is to be able to update the points in near real time. More updates on that as soon as we finish our next phase design.
  • Merging your tracks. We have previously rewritten the process which merges tracks, and it is already able to scale out by just adding more processes. However, only now have we finally developed a supporting system which can actually accommodate and distribute the data for merging. This now enables us to do some parts of the merging process in near real time. Our goal is that within an hour of a drive you will see your track on Cartouche along with any new roads and cameras you added. You will also get permissions to edit along your drive immediately.
    Some parts of the merging process, such as detecting turn restrictions and road directions, are still done offline once a day. But we are working on improving that part too and making it near real time as well.
    The new system is under final testing and is planned to be deployed within a month.
  • US/World deployment differences. Map problems are still shown only on the US system. We are planning to deploy it on the world servers as well but I don't have an ETA yet. Our policy is not to have differences as it makes our life harder to manage different configurations, however in this case a gap was created and we are trying to close it as soon as possible.
Improving our "daily process" is currently our first priority among the infrastructure tasks. We are working hard on breaking all the dependencies and redesigning everything to allow scaling out easily. Our first goal is to reach predictable iterations, as I'm sure it's currently pretty frustrating to work on the map and have no idea when it will be available on the client. As we progress, we'll keep updating you!

Ehud
Last edited by AlanOfTheBerg on Sat Dec 03, 2011 3:41 pm, edited 1 time in total.
Reason: Unsticky due to time

Postby AlanOfTheBerg » Fri Nov 04, 2011 3:04 pm

I am very appreciative of the announcement and the information provided. One glaring omission, unless I am misunderstanding, is the performance and upgrades/updates to the routing servers. Those are nearly the first thing a new user interacts with as they attempt to get a route to a destination, and today, these routes are still failing to have full integrity on the client. Every day I still get a route which has gaps in it. Maybe that is actually a problem with the map itself and not an issue with routing?
Oregon-based US Country Manager | iPhone5 - VZ - iOS 6.1.2 | Waze v3.6

Postby The Fej » Fri Nov 04, 2011 4:18 pm

Gaps indicate an inconsistency between the tiles on the client and the tiles on the routing server. This should not happen, because the client checks each tile's "dirty bit" to see whether it needs to be refreshed. However, if the routing server tiles are updated and the client tiles update has failed (they are different tiles on our server), it might happen, though in practice server tiles should not be updated without the client tiles being updated at the same time.
This will be tackled as part of the tile bug fixes, as we work toward perfecting the process.
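
As a rough illustration of the mechanism described above, here is a minimal sketch. The post only mentions a "dirty bit"; the version numbers, the `Tile` type, and both helper functions are assumptions made for the example and do not reflect the real client or server code.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    tile_id: int
    version: int        # hypothetical version counter, not from the post
    dirty: bool = False


def tiles_to_refresh(client_tiles, server_versions):
    """Set the dirty bit on client tiles that are older than the server's copy."""
    stale = []
    for tile in client_tiles.values():
        if server_versions.get(tile.tile_id, tile.version) > tile.version:
            tile.dirty = True
            stale.append(tile.tile_id)
    return sorted(stale)


def route_has_gap(route_tile_ids, client_tiles, routing_versions):
    """A route can show a gap if any tile it crosses differs between the two sides."""
    return any(
        client_tiles[t].version != routing_versions[t] for t in route_tile_ids
    )


client = {1: Tile(1, 7), 2: Tile(2, 3)}
routing_server = {1: 7, 2: 4}  # tile 2 was updated on the routing side only

gap_before = route_has_gap([1, 2], client, routing_server)
stale = tiles_to_refresh(client, routing_server)

# Simulate a successful refresh of the dirty tiles.
for tile_id in stale:
    client[tile_id].version = routing_server[tile_id]
    client[tile_id].dirty = False

gap_after = route_has_gap([1, 2], client, routing_server)
print(gap_before, stale, gap_after)  # prints: True [2] False
```

The gap in this model appears only in the window where the routing side has the new tile and the client refresh has not yet succeeded, which matches the failure case described in the post.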

Postby noam » Fri Nov 04, 2011 4:31 pm

Just to be clear - the routing server performance IS being worked on. We are working on better distribution and scaling of the server. We don't have an ETA on it but it obviously keeps us up at night (literally)

Postby jondrush » Fri Nov 04, 2011 4:55 pm

What are you doing to address the regular loss of the availability of map editing? Will one of these improvements cover that? I'm not trying to be a downer. I think everything you've listed here looks great.
Keeping the Waze maps tidy since 2009
North East Region Coordinator

Postby argus-cronos » Fri Nov 04, 2011 5:14 pm

oh man i love such statements, thanks guys :-)

Postby dohartman » Fri Nov 04, 2011 5:35 pm

Thanks for the update; hopefully all of this will happen soon. I personally know a few friends who started using Waze in the last few weeks because I told them how great it was, but because their points were never updated and it always showed them as "joined today", they stopped using it.

Also, any word on when the 3.0 bugs will be fixed on the iPhone? Specifically the one where the events radius limit does not work, i.e. setting it to 5 miles is ignored and you see events from any distance.

Postby a4xrbj1 » Fri Nov 04, 2011 6:45 pm

noam wrote: Just to be clear - the routing server performance IS being worked on. We are working on better distribution and scaling of the server. We don't have an ETA on it but it obviously keeps us up at night (literally)

Noam,

It's not only the routing server performance. For months now I have seen no learning in the routing. At any time of day, Waze tries to calculate the silly route to KLCC in Kuala Lumpur, which isn't my daily destination; I only drive there once in a while. Yet I drive almost daily to work and back home, using probably 3-4 different routes depending on traffic. Guess what: the Waze routing server gives me only one route if I enter either work or home as the target, and in many cases it's not the fastest one.

I've checked the routes; they're all connected, but it seems the learning has stopped. Probably due to the infrastructure as well.

But what is routing software worth if it doesn't route? Still, your no. 1 problem is that you guys have a retention rate similar to Zynga's: hardly any users stay longer than a couple of days, and you're lucky if they stay more than 1 month.

That's what needs to be addressed, as otherwise your potential prospects are all burned by a bad user experience in the first place. And I know quite a bit about that, as I'm responsible for our 12+ million paying users here at the no. 2 telco in Malaysia, in my role as SVP of Customer Value Management. If we hadn't developed a proper welcome program, followed by proper cross/upsell, our clients would be fleeing in the same way as new Wazers do. Where is your customer lifecycle program? How are you guiding the user from life stage to life stage? Where's your prediction of user attrition? Do you react to changes and declines in daily usage? No, nothing (probably because a) you don't have anybody who has a clue about that and b) you didn't even know that it's important).

A large percentage of them are leaving because they don't find anything on the map, neither their destination nor their current position. That's because the map is not well deployed, and because your company doesn't place enough emphasis on supporting new users by creating a welcome program. How do you react to the fact that a new user is starting at a location where there is no map yet? Where's your tutorial to react to this event and maybe turn the bulldozer on for him? How do you communicate at all with new users, to react to different events? Just giving them points for their first update request sent isn't good enough. You have to react automatically to it with the right explanation/tutorial. Go and play any Zynga game on Facebook and you will understand how perfectly they take care of their new joiners and keep them connected to their game. What do you do?

Then your gamification approach isn't working, for two reasons. The first is the most obvious one: getting no points, or not getting what is promised (+100 points) when the user fulfills his part (like driving 2 days in the first week), is a complete turn-off and, as mentioned by dohartman, another main reason for leaving right after coming on board. The second is that incentivizing with extrinsic motivators only works for a certain period of time (see Foursquare as the perfect example of how quickly badges and leaderboards/points wear off) and can even lead to less usage/fun, which is called overjustification. So you need to find a way for the intrinsic motivation (for me, building something that potentially saves gasoline by letting people avoid jams and take the fastest route, therefore saving our planet and some lives along the way, as no jams also means fewer accidents) to replace the extrinsic motivation of climbing up your leaderboard. The intrinsic motivation is the pleasure of beating the obstacles on our way to home/work by using Waze to avoid the jams, being faster, being smarter than those not using it. Yet you're not communicating with the user on that intrinsic motivation to connect him further to Waze (like "Congratulations, by using this route suggested by Waze rather than your standard route you arrived 5 minutes earlier!").

Lastly, focusing on crowdsourcing is a great idea, but it's worth nothing if your users don't get it (hardly any of the new users understand that they are responsible for building the map) and if you're not also focusing on helping your superusers (like us in the Top 15 worldwide, all with more than 150000 edits) to build up that map. The same rule applies as in many other internet-based apps (Twitter, e.g.): 90% is passive, 10% is active and less than 1% is doing 90% of the work!

Yet on each of those 150000 edits we're held back by buggy software that doesn't support us in how we spend our free, unpaid time! We still have to switch between the old Cartouche (for seeing update requests) and the new Papyrus when we, e.g., clear stray roads left by bad GPS reception (deleting them is much faster on Papyrus than deleting them one after the other on Cartouche). The same goes for working on update requests and finding errors in the road segments.

Not to mention that you give us no proper communication tool to interact with the end users about their update requests. Hence 90% of them are useless, as there is no comment and we have no time to spend 5-10 minutes figuring out why something was reported (and often wrongly classified, e.g. as a missing bridge). I set up a Facebook page, "Waze in Malaysia", which has 177 engaged users, but getting them there is a nightmare as your tool is missing the communication part between users (and don't tell me to use the PM system here, it's a nightmare as well).

You see Noam, there's a lot more that can be done, and a lot that can be done better, but after more than 2 years now it's time to get your act together. Upgrading the infrastructure is actually hygiene, not even worth talking about; it should just work (but we accept some glitches as it's a free service). You have more than 4 million email addresses from former users (assuming all of them registered); that's the perfect basis to start your Customer Lifecycle Program!

Hope this lengthy post helps make Waze better. I haven't fully given up on you guys after my rant a couple of weeks ago, but I almost stopped all work. I've overcome some of my frustration and I'm willing to help again in my sparse free time, as working on map edits somehow relaxes me from the stress at work (another intrinsic motivation BTW). I just don't know how long my patience will last.

Cheers,

Andreas
Country Manager for Malaysia and Singapore (and Brunei it seems)
Country Manager of Malaysia, Singapore & Brunei
My Waze videos on editing: http://bit.ly/YTr3z0
Join our FB group: http://on.fb.me/12t1CVU
or Google+ group: http://bit.ly/VDNSjO

Postby andrewfatcat » Fri Nov 04, 2011 6:53 pm

I have never been as satisfied as I am with the current V3.0 client and Cartouche. V3.0 has never crashed and the new Cartouche really makes editing much easier.

There is only one thing I think needs to be done as soon as possible: I would like to see toll roads and carpool lanes as new road types. It shouldn't be difficult to add just these two road types for map editing; we have been asking for this, but it has not been done. The routing function to avoid them can be added later, but you've got to have these road types first.
AndrewFatCat
Area Manager in Houston, TX, U.S.A.

Postby gettingthere » Fri Nov 04, 2011 8:36 pm

noam wrote: Just to be clear - the routing server performance IS being worked on. We are working on better distribution and scaling of the server. We don't have an ETA on it but it obviously keeps us up at night (literally)


WOW. A post from the Waze CEO!

Thanks for taking the feedback from your dedicated users seriously! We are just as interested as you in Waze being successful (even without a financial incentive)!

We hope that you and your employees continue to keep the updates on these improvements coming. I am very happy that your team is working on coding, process, performance, and redundancy to make the Waze experience more reliable.
Waze Champ
iPhone 5, iOS 6.1.4, Waze 3.6