How hard can it be? (Or why you don’t have custom per contact ringtones on Maemo)


Often in blogs, forums or on IRC you can find people complaining about missing features in some programs (and some of them are very rude). While they can be right sometimes, other times they just make me angry because they don’t know how difficult writing software can be, and they don’t understand the difference between a semi-working prototype and a proper stable application written by professional developers, designed by professional UI designers and tested by professional testers.

Implementing some features can actually be quite difficult, and it can be better to leave them out of your product and focus on other things; on the N900 one of these missing features is the ability to set customised ringtones for specific contacts.
Several people wondered how hard it can be; after all, a lot of old phones do it. What they don’t consider is that, in many ways, the N900 is not a traditional phone and is more similar to a small computer. On the other hand, the N900 still needs to be reliable to be certified as a phone; for ringtones this means that the ringtone should be played as soon as the phone call is received, or the user could miss it.
Now suppose your N900 is under heavy load due to multitasking (real multitasking, like on a normal computer) and you receive a phone call from a friend; being a close friend that often calls you, you have an MP3 ringtone set just for him. The phone has to look up the contact corresponding to the phone number, load the file from the (slow) memory card, load the libraries for playing the ringtone, uncompress the file, and finally play it. All of this on a phone under heavy load with most programs swapped out of memory!
To work around this problem the N900 seems to use some tricks: the ringtone is uncompressed into a (big) WAV file and saved on the faster (but small) internal memory, and the component playing the ringtone is memlocked (i.e. never removed from memory). Of course, you cannot do this for every possible ringtone or the already small disk space would be used up immediately. Choosing not to uncompress the files, on the other hand, would mean keeping all the possible codecs loaded in memory.
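
To make the "uncompress once, in advance" trick concrete, here is a minimal Python sketch. It is only an illustration and not how the N900 actually does it: the function name, the cache path and the assumption that the MP3 has already been decoded to raw PCM by something else are all hypothetical.

```python
import os
import wave


def cache_ringtone(pcm_frames, cache_path, sample_rate=44100, channels=2, sample_width=2):
    """Store already-decoded PCM audio as the single cached WAV ringtone.

    Hypothetical helper: decoding the MP3 into `pcm_frames` is assumed to
    have happened earlier, when the user picked the ringtone, so nothing
    has to be decoded when a call arrives.
    """
    tmp_path = cache_path + ".tmp"
    with wave.open(tmp_path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_frames)
    # Always reuse the same file name, so only one decoded ringtone exists.
    os.replace(tmp_path, cache_path)
    return os.path.getsize(cache_path)
```

The cost moves to the moment the ringtone is chosen, and the disk usage stays bounded because there is only ever one decoded file. Doing the same for every contact multiplies that file by the number of distinct ringtones.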

Does this mean that it’s impossible to have a different ringtone for a specific contact on Maemo? No, it just means that if you want it you have to be ready to accept that the ringtone could start playing a couple of seconds later in some uncommon heavy load conditions. When you are ready to do that you just have to wait a couple of days, so that I can polish and publish the program I wrote to have custom ringtones :D

In other news, I’m going to GUADEC for the whole week: see you there!

Reader Comments

Very interesting. I wouldn’t have thought of this being a (possible) reason for omitting the “ringtone per contact” feature. Makes sense now that I read it, though.

Those excuses for lack of very basic functionality are very lame:

The phone program needs to look up the contact corresponding to the phone number anyway, since it displays name and image.

It needs to read the WAV (which is possibly much larger than any MP3) not from some fast internal memory as you falsely claim, but from the eMMC, which is the probable source of the putative MP3 anyway.
Considering you have 30 GB of storage, and that the phone program seems to leave WAVs of old ringtones around anyway, claiming that lack of space is the reason for not having multiple ringtones sounds very incorrect.

[...] answer to a question that many N900 owners have: why aren’t contacts allowed to have custom ringtones? And mxr.maemo.org, a code cross-reference tool for Maemo, was [...]

@Matan: You just prove Marco’s point.

The call-ui is not memory locked (only the part that rings). So it can start ringing much before it actually loads the contacts.

Reading the wave file is not so much of an issue as loading GStreamer, libosso-abook, etc that are required to pick a per-contact ring tone and decode it. And loading them isn’t really the problem. It’s getting enough RAM by pushing other applications down to swap on the slow eMMC. The whole process can take a couple seconds, enough to miss a call (and enough to not be acceptable as a phone).

The decoded wave file is on the fast 256MB chip, not on the slower 32GB eMMC. There is only one decoded wave file at a time (it always has the same filename.. so can’t leave old ones).

@Matan:
You don’t know what you are talking about.
The phone application is a separate process that is not memlocked. If the UI appears one second later it’s ok, but the ringtone has to start ASAP to attract your attention.
Uncompressed ringtones are not in the same partition as the MP3s.

Interesting points. I thought the uncompressed MP3 was stored in /home, which *is* on the eMMC – or has it moved in the last couple of releases?

Software is hard (especially semi-realtime, multi-threaded, multi-tasking, resource constrained software). But the “you might miss a call if it takes a couple of seconds” argument isn’t of particular comfort: yes, it may start ringing quickly (although I missed a phone call today because it didn’t ring in time), but the PR1.2 seems to handle mid-load levels particularly badly (unresponsive touch screen with two Google search result pages open over 3G, and Media Player playing 192kbps MP3s over A2DP Bluetooth headphones).

Displaying the call UI, and responding to any input at all, can take 3-5 seconds (IME); if missing the call due to delayed ringing is a possibility, it’s a possibility by not being able to accept the call either.

The technical problems you outline, use-case analysis and the limited development resource and time (and, undoubtedly the hallowed ‘UI Specification’) all contribute – no doubt – to the lack of this feature. It doesn’t bother me at all.

However, this post just happened to show up at a time when I literally (honest) considered throwing my N900 out of the train on the way home because of its performance (top didn’t reveal anything sapping the CPU, load average was low and median idleness was >80% whilst top was open. Not playing music over BT headphones helped, but if I’m going to constrain my usage habits to match the actual capabilities of the device, why not get a better device? :-()

Sorry for the mini-rant and off-topic rambling!

Ah, /home/user/.user-ringtone just contains the filename of the original file. Where’s the uncompressed WAV held then?

Hm, you are actually right. The uncompressed wave is in ~/.local/share/sounds, that is the same chip as the big FAT MMC, but on the ext3 part (that is never unmounted, while the FAT part is unmounted when in mass storage mode).

OK, maybe custom ringtones per contact is hard. But iOS and Android can do it. Windows Mobile and Symbian can do it. (My Nokia N95 from 2007 can do it!) If Nokia, Intel, and the Maemo/MeeGo community expect MeeGo to be taken seriously, complaints about something being difficult that all the other smartphone platforms can do (both “open” and “closed”) have to stop.

@AJ:
The iPhone OS and Android don’t offer true multitasking, so the comparison wouldn’t be valid.

I don’t know/care about Windows, but Symbian… doesn’t Symbian do true multitasking the way Maemo does? How do they solve the issue with Symbian then? And why wouldn’t whatever they do there work on the N900?

@AJ:
As Oskar said iPhone and Android don’t do real computer-like multi-tasking.
Symbian has a different model for memory management that I don’t know in detail, but on Symbian most programs just start crashing (they shouldn’t but they do) and the OS complains about lack of memory. AFAIK there is no swap, so there is no slowdown.

@All:
Note that I didn’t work on this area, so I don’t know any details about this and it’s all speculation. Everything I discovered was discovered because I implemented custom ringtones by myself.

Symbian does allow setting per-contact ringtones.

On the other hand, I did not have any per-contact ringtones and sometimes my Nokia 5800XM started to ring not my chosen tone, but the nokia tune. I’m guessing because fetching the real tone was taking too long.

I would much rather have a *working* simple tone system than a complex one that sometimes starts playing the Nokia tune, so that you don’t answer the call because you don’t think it’s your phone ringing.

This was another extremely valuable blogpost to close the gap a bit more between developers (knowing technical backgrounds&limitations) and users (“should be trivial to implement”.)
Thanks a lot for it.

I really don’t believe that’s a valid argument, that it would take several seconds to load the ringtone. If I put my phone under heavy load and then open a terminal and type “mplayer foo.mp3” the sound starts almost immediately. If the ringtone-generating process was reniced that would help even more.

Also, my MP3s are 2-3 MB. Given that I don’t believe you need the whole MP3 file to start playing it, it would probably take the ‘slow’ memory maybe half a second to load that? If that’s too long then it’s still probably pretty simple to auto-downsample and chop selected ringtones and store them on the faster card.

Given that phones with processors slower than 100 MHz and even slower memory devices can pull this off, a hand-wavy argument about swapping isn’t going to convince me that the N900 is too slow to do ringtone per contact. Maybe some hard numbers and experiments might, but then if we go through all that effort to prove that a 600 MHz phone is too slow to do something a 100 MHz device can do, something else must be wrong!

@ZachGoldberg:
As I said I didn’t work on it and I don’t know the details, but I’m sure that the people working on it actually did multiple tests.
About the heavy load, if the N900 (or even my laptop) is swapping a lot then it can become really slow and starting new applications takes a really long time.

If the fundamental problem is swapping out stuff from RAM to make room for the phone then other solutions must be possible. For example, one could always keep the entire phone & contacts app in memory. I can’t imagine they consume that much, and having the device act as a good phone is pretty important and for most people worth sacrificing the 5MB of free memory.

It seems like this problem is less that the Maemo folks couldn’t find a good way to do it (because I firmly believe this device and software stack could do it no sweat, performance wise) and more that they just didn’t have the time.

@ZachGoldberg:
Memlocking the contact stuff means memlocking basically any library used in the Maemo stack.

I don’t buy it.

1) You’ve already been contradicted on the WAV file.
2) Every other phone OS can support this, even Symbian, which ran on devices with substantially less RAM and MMC available.

If the phone-app has CPU priority, and it seems to, then getting I/O time to read() MP3s, FLAC, or OGG files should not take more than a few milliseconds. Has anyone at Nokia even bothered to write a test case that institutes per-contact ringtones and seen these “horrific” response times that you want to claim?

I’m almost half-tempted to write a benchmark app just for Maemo to time read() speeds while performing other load-bearing operations (CPU and I/O).
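
A benchmark of the kind proposed here could start from something like the following Python sketch. It is a rough starting point under stated assumptions, not a finished Maemo tool: the function names and the 64 KiB chunk size are made up for illustration, and the sketch does nothing to generate load or to evict the file from the page cache, which a real test would have to do.

```python
import time


def time_first_read(path, chunk_size=64 * 1024):
    """Return (seconds, bytes_read) for opening `path` and reading one chunk."""
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        data = f.read(chunk_size)
    return time.perf_counter() - start, len(data)


def summarize(path, runs=5):
    """Time several reads; the worst case is what matters for a ringtone."""
    timings = [time_first_read(path)[0] for _ in range(runs)]
    return {"best": min(timings), "worst": max(timings)}
```

Note that repeated runs on the same file mostly measure the page cache. The interesting number is the first read after the system has been busy swapping, which is the situation the post describes.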

@Michael Cronenworth:
As I said Symbian works in a different way.
Knowing how Maemo development works I’m sure they did tests on the performance of various things; new stuff is not integrated if not working well enough.

Also, did you all understand that I actually wrote a program to get per-contact ringtones? The fact that there are so many people just whining makes me really want to spend my free time on this…

@Michael: Marco’s actually written and implemented this (and done it properly, not a half-arsed hack). To the best of my knowledge, you haven’t. So, I’m inclined to trust Marco a great deal more than I am a random drive-by commenter whose name I have never seen next to any Maemo code whatsoever saying ‘oh, that can’t be true, all the evidence to the contrary must be a lie because it doesn’t sound right’.

To everyone proposing mlocking the world: GStreamer and all its dependencies, GTK, Clutter, libGL, D-Bus, libX11 and all its dependencies, etc, etc. It’s not really that small at all.

@barisione, @daniels
I don’t know either of you either, nor what code you have written. I could care less about what you know about me with your elitest attitudes. Take a chill pill.

I will gladly wait to see what barisione comes up with. I will appreciate any work he will do. Please don’t misunderstand me as a troll. I am far from it.

Wow I learned a lot reading this blogpost and then the discussion below. Even though at times a bit flamy, it held some interesting points.

And I can’t wait for your application that handles multiple different ringtones for contacts! :)

All right, ringtones are not so actual.
But Contacts in general are made poorly.
Yes, I understand, this first device which at last has united all medias in one contact. But really who did not try to test “really” operation with contacts?
How it is possible to work with any adequate quantity of contacts without grouping? Already even at 2-3 hundred persons to understand who from them who it is very difficult.
How it is possible not to provide possibility of output in the list not only “name lastname” but also nickname and the companie? Besides, already at two-three hundreds contacts there is a set double and more coincidences “name lastname” and what to understand who from them who – it is necessary to enter into each of contacts!
Such impression that with contacts did not check operation before implantation. And that daily prevents to enjoy me the excellent device n900.
(sorry, machine translation)

@Eugeny:
The contacts application has been implemented with the feature set it has and not something different because the UI designers thought it matched the needs of the target customers.
What you ask for would not have been too difficult to implement, but you also have to consider that when you start a new platform from scratch you cannot implement everything. As any developer knows (or should know?) just adding too many developers to a team is not going to improve what the team produces.

Good news,

But I am waiting for the possibility to group contacts for SMS, MMS, ringtones and more – this is much more needed than individual ringtones per person. Is there any news around on this? Anybody working to solve this?

Kris

@barisione
I understand that it is impossible to make at once much.
But has passed already almost year from release n900, it is possible to make already contacts more usable?

The core that does not suffice me – is a variant of the list “last ‘nickname’ first [firm]“.
The second for importance – possibility of any grouping.

P.S. By the way, I still cannot find a way to sync all contact fields to any external service or program, so that the gtalk, jabber, skype, etc. fields would be there. Is there any way to do a _complete_ synchronisation? I tried PC Suite, Ovi Suite and Funambol, and all of them lost some fields.

@kris:
I have some plans for it, but I’m not sure if I will have time for it. Some more details in tomorrow’s blog post.

@Eugeny:
The problem with sync is that those fields are not supported by any service as far as I know :(

What about playing an uncompressed default ringtone as a kind of “intro” at once and do some fade out, fade in as soon as the customized ringtone and all its dependencies have been loaded in the background? Or: After the state of CPU/SWAP have been checked?
This could combine the effects of knowing “my phone ringing” with bypassing the first few critical seconds and knowing who’s calling.
Just an idea, please let me know if it is stupid nonsense.
Keep up your great work!
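
The idea above can be sketched in a few lines of Python. This is only an illustration with hypothetical names (`choose_ringtone`, `load_custom`); a real implementation would hand audio buffers to a player and handle the crossfade, which is not shown here.

```python
import concurrent.futures


def choose_ringtone(load_custom, default_tone, deadline_s):
    """Start loading the custom tone and fall back to the default on timeout.

    Returns (tone_to_play_now, future). If the default is returned, the
    caller can wait on `future` and switch tones once loading finishes.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(load_custom)
    try:
        return future.result(timeout=deadline_s), future
    except concurrent.futures.TimeoutError:
        # Loading is too slow (e.g. the system is swapping): ring right away.
        return default_tone, future
    finally:
        # Do not block here; the loader keeps running in the background.
        pool.shutdown(wait=False)
```

With a short deadline the phone always starts ringing on time, and the custom tone only replaces the default when loading turns out to be slow.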

@Achim:
It’s not stupid, my old landline phone was doing the same as getting the caller ID takes a while on landlines.
IMO it’s not a proper solution, most of the times it would just play for half a second and the effect would sound quite bad.

@barisione

No, the problem is that I can lose my beloved contact fields!
a) Sync device, reflash, sync again – no fields.
b) Sync device, modify contact from another client, sync – no fields.

Sync must place my data in some suitable fields on the server, for example “user notes” or another rarely used field, so that it is not lost!

N.B. And how hard is to implement more lists type, like “last ‘nickname’ first” and groups?

The default sync apps are closed source and I never had access to the code, so I don’t have any idea on how they work.
The open source ones just implement a subset of the fields and drop the others :(

Implementing a different display name in the address book is probably not difficult, but it’s impossible without touching the closed source libosso-abook.

@barisione
Stupidity everywhere. :(
A great mobile device with the killer feature “all-in-one contacts” is unusable exactly in the contacts application. I have only about 450 people, and I blame the contacts creators every single day. All other functionality exceeds all my needs.

I just hoped for a firmware update, but judging by what you say is useless.

To be honest, I lose around 1 call a week, just because someone calls while my N900 is checking for updates or does anything at all really. In that case:

1) screen backlight comes on
2) after 2-3 secs inactive popup comes up in landscape mode
3) it takes 1 sec to switch to half-screen popup in portrait mode
4) it takes another 1-2 secs to switch to full-screen portrait mode
5) it takes another couple of seconds before the screen starts responding to me clicking the “answer” button.

Can we just assume that the basic calls are utterly broken anyways? As much as I agree that custom ringtones could make it any slower. “N900 still needs to be reliable to be certified as a phone” – I’m amazed it is. N900 is a half-working prototype itself.

Very interesting reading. Thank you for your efforts and insight.

[...] with something more complex could make the ringtone start slightly later in case of heavy load, see my previous blog post. You have been warned [...]

Michael@21: I could care less about what you know about me with your elitest attitudes.

I hate to display another supposedly elitist attitude, but “I could care less” doesn’t make any sense. David Mitchell relays a message from the Queen on this important topic.

(Of course, Language Log thinks I’m in a minority here. But it’s a logically consistent minority!)

I am quite pleased to see that eventually somebody admits that the N900 is not a real telephone but a semi-working prototype and does not contains ‘proper stable application written by professional developers, designed by professional UI designers and tested by professional testers.’ BTW, also without loading ring tones from a storage card my N900 does not reliably pick up calls.

Regards.

Paul

Ad posting 33 by Viraptor.

This happens to me at least once a day.

Cheers.

H.

@Paul Lahner:
If you could read you would notice that I said the opposite.

Please stop defending the N900.

People who really use it know that the software has a large number of defects. I don’t care if people work for Nokia or not, but the software is seriously lacking – both as in missing features and in being broken.

For example – who likes a phone that sends the battery warning beep to the BT handsfree I have hanging around my neck? Or a phone that may just stop playing music in the BT handsfree and when I pick up the unit and look at the display I notice that it is calling. And I regularly miss calls because of the slow UI – possibly switching the orientation making me press reject instead of accept.

Anyway – back to the topic of this article.

No, I can’t really see a problem having full support for individual ring signals or individual behaviour. The Linux OS allows more cool things to be added to the unit, but it does not remove the possibility for standard phone features – yes having contacts grouped and with individual ring signals is standard and the lack is a big Nokia disgrace.

It seems to be the conclusion that the ring signal is uncompressed WAV data stored on the 32GB eMMC memory.

It should be reasonably simple to convert the first minute of a large number of ring signals/soundtracks into WAV files. At full CD quality, it takes 176kB/second for stereo. 100 signals (yes, extreme) times 1 minute means about 1GB. But I would bet that 20 individual ring signals would be more reasonable at 200MB. Half if in mono. Suddenly no need for a lot of codecs for custom ring signals of varying format. The conversion can be done way before the ring signal is needed.
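
The arithmetic above can be checked with a short Python sketch. It assumes CD quality means 44.1 kHz, 16-bit samples; the function name is just for illustration.

```python
def pcm_bytes(seconds, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Size in bytes of uncompressed PCM audio (CD quality by default)."""
    return seconds * sample_rate * channels * bytes_per_sample


per_second = pcm_bytes(1)                      # 176400 bytes, about 176 kB
hundred_tones = 100 * pcm_bytes(60)            # 1058400000 bytes, about 1 GB
twenty_tones = 20 * pcm_bytes(60)              # 211680000 bytes, about 200 MB
twenty_mono = 20 * pcm_bytes(60, channels=1)   # 105840000 bytes, about half
```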

If we assume a max of 255 ring signals, all signals named /.ring and 500 numbers with a maximum of 24 digits each, a mapping from phone number to ring signal can easily fit in 16kB with space left for data structure optimizations.
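
One way such a table could look is sketched below in Python. The layout is an assumption made for illustration, not something specified in the comment: each record packs a number as 12 bytes of BCD digits plus a one-byte ring signal index, and lookups use binary search. All names are hypothetical, and non-digit characters such as "+" are simply dropped.

```python
MAX_DIGITS = 24
KEY_BYTES = MAX_DIGITS // 2      # two BCD digits per byte
RECORD_BYTES = KEY_BYTES + 1     # key plus a one-byte ring signal index
PAD = 0xF                        # filler nibble for short numbers


def pack_number(number):
    """Pack up to 24 decimal digits into 12 bytes; other characters are ignored."""
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if len(digits) > MAX_DIGITS:
        raise ValueError("number too long")
    nibbles = digits + [PAD] * (MAX_DIGITS - len(digits))
    return bytes((nibbles[i] << 4) | nibbles[i + 1] for i in range(0, MAX_DIGITS, 2))


def build_table(mapping):
    """Build a sorted table of fixed-size records from {number: ring_index}."""
    records = sorted(pack_number(num) + bytes([idx]) for num, idx in mapping.items())
    return b"".join(records)


def lookup(table, number):
    """Binary-search the table; return the ring index or None."""
    key = pack_number(number)
    lo, hi = 0, len(table) // RECORD_BYTES
    while lo < hi:
        mid = (lo + hi) // 2
        start = mid * RECORD_BYTES
        if table[start:start + KEY_BYTES] < key:
            lo = mid + 1
        else:
            hi = mid
    start = lo * RECORD_BYTES
    if table[start:start + KEY_BYTES] == key:
        return table[start + KEY_BYTES]
    return None
```

With this layout, 500 numbers take 500 × 13 = 6500 bytes, which is well inside the 16kB budget.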

Code to map phone number to ring signal is in the kB range.

Preallocate a ring buffer of 450 kB for 3 seconds.

The phone already seems to have support for playing the ring signal, and the ring signal seems to already exist in the eMMC memory. So there seems to already exist support for reading the ring signal from a file. It is however not proven if the ring signal is loaded on each call, or if the first part is preloaded.

Anyway – I would be very surprised if not 1MB of prelocked memory would manage individual ring signals at least as well as the current solution manages the only signal.

And without increasing the memory needs, it would be trivial to add more extensions.

When using a handsfree, I don’t want a ring signal – I would prefer the name of the caller, just as I had in the N95. Prerecording 3 seconds times 500 contacts would be 250MB in CD-quality stereo or 125MB in mono. Settling for 8-bit ulaw (easy to on-the-fly upgrade to 16-bit wav if needed) would reduce the needs even more.

And it would require but a few kB to store access rights for the contacts.

- If a contact may call through when busy in a meeting.
- If a contact may only call on certain hours, getting a disconnect and optionally an SMS: “Support is open 07:00 to 18:00. For critical issues, call three times within 3 minutes – but if I deem it not critical, I will paint your house pink, or make your daughter pregnant.”
- If call should be “downgraded” to IM.

@pwm:
Thanks for exactly proving the points in my post.

I understand that it is very hard to program such things.. And I’m happy that you do it ;-)

There are many reasons to have custom ringing per contact. I personally have an even more complicated use for it. Not only a special ringtone: for me there should be some emergency configuration,
so that even if I have all sounds muted, a call from ONE special number rings as loud as possible.

I think that is even harder to program. (But if you could do that.. WOW ;-) )

@Chris
“I think that is even harder to program. (But if you could do that.. WOW ;-) )”

No, it is not hard to have code that performs tens or even hundreds of decisions based on the caller ID, time of day, … The only really hard part is to create a good UI where the user can configure all these features in a way that they will clearly see and understand what will happen and when.

How do you present what will happen if:
- that specific number calls when you are in a meeting?
- that number calls during office hours?
- that number calls evenings/weekends?

- that number calls while you are busy using voice chat?

When mixing a voice modem with a computer, it is possible to implement almost anything – even a message box directly in the phone, or voice prompts and use of DTMF digits to give a password for critical calls.

The sky is the limit, but the end result may be a big failure because people don’t understand how to use the features. The hardest job for a product owner is to decide what features to _not_ add to a product, just to make sure it doesn’t get too complex for the end user (or for the support line).

It’s always the UI that is hard to make easy to use. To implement the decision logic is quite simple, as long as the specification has clearly defined the priority (to be duplicated in the manual and hopefully visibly in the UI) between different rules.
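
As a rough illustration of how small the decision logic can be once the priorities are specified, here is a Python sketch. The rule set, the field names of the call dictionary, the phone number and the action names are all invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    priority: int                      # lower number wins
    applies: Callable[[dict], bool]    # predicate over the incoming call
    action: str                        # e.g. "ring_loud", "silent", "send_sms"


def decide(rules, call, default="ring"):
    """Return the action of the highest-priority rule that applies to `call`."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.applies(call):
            return rule.action
    return default


rules = [
    Rule(1, lambda c: c["caller"] == "+15550100", "ring_loud"),
    Rule(2, lambda c: c["in_meeting"], "silent"),
    Rule(3, lambda c: not 8 <= c["hour"] < 17, "send_sms"),
]
```

The code stays trivial; the hard part remains showing the user which rule wins when several of them apply to the same call.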

To everyone saying that contact groups is a standard feature that literally every phone has ever and is essential: the iPhone doesn’t have groups.

@daniels
It doesn’t matter whether it’s standard or not. You just can’t manage your contacts without grouping if there are more than a few hundred.

Crochik Mycontact group ringtone worked perfectly. There was no lag that would bother me. I tested multiple times under various loads and it seems to ring right after the 2nd ring on the other end.

I converted all my ringtone to .wav. 5mb or less per ringtone.

I’m just waiting for Crochik to link SMS to group and I will be happy.

@Eugeny:
What’s the problem with managing contacts if they are more than some hundreds?

[Edit: serious question, I'm not trolling. It's just that if I want to implement it I need to know what people want]

@barisione
The main problem is that I can not remember and identify all the people in my contact list only by name+lastname. There are people from companies, which I can not remember by name, but I remember the name of the company (right now i can’t find this people starting type companyname). There are many people whom I remember on nicknames. My list have up to three people with identical first and last name.

In general, only “name lastname” too little, for correct identification.
Must be something else – the groups, company names, nicknames (with the name and surname, such as winmobile “Last ‘nickname’ First”).

@Eugeny:
That would be solved by a smarter search. That’s one of the things I would like to implement (but I’m not sure if I will ever have time for it).

@barisione

I think content search alone does not help. For example, if I merge contacts I get only a short list, and I can’t figure out which person to merge into if there are three identical “first last” entries.

There can always be found examples of products missing some specific features. But that can’t be used as an example why a feature isn’t needed or expected.

One reason for groups is that you may have hundreds of customers in the phone. You don’t really know the people so when you get the call, the name may not be enough to understand who is calling – you may need some way to hear or see a company name.

If getting support calls, you may need to be able to separate customers who have flat-rate support and customers who have to pay for each call, in which case you would like color, ring signal or text to give some feedback. And the flat-rate support may be allowed to call through during weekends, while the pay-per-call customers only have 8-17 support hours. Some companies have multiple support numbers that get routed – some companies don’t.

With lots of people, you will get multiple name collisions – how do you separate your friend “John Anderson” from customer “John Anderson” or that evilishly boring sales person “John Anderson” who always wants you to replace your copier with his brand?

Children playing football? You may have 20 numbers to other parents in the football team. You don’t know half of them. How do you recognize that one of them is calling? Or how do you find a list of football parents, in case you need to call someone but can’t remember the name until you see it?

Some numbers you may add for short-term use. You want to be able to locate the names and garbage-collect them a month after the last incoming or outgoing call – it may be a number you got on a sticky on your monitor about someone to call to solve a problem.

In the end, a phone book is way more than a list of names and numbers. It is one of several organizers used to keep control of your business and private life.

The ability to group people, and to receive a call while seeing not just the name but some associated information such as a company name or a free-text field before picking up a call is a very useful feature. In some situations, controllable ring signals can help with some of these tasks – for example separating family from friends, from priority customers from unprioritized customers.

That is why groups and multiple ring signals is one of the basic features of a phone.

A phone that is a computer should obviously be able to do more than just switch ring signal.

It should for example be able to integrate contact information with calendar information so that you may directly see your next planned meeting, or what you agreed on during your previous call and/or meeting. “Ah yes, but let’s focus on this question next week, since I will visit you on Tuesday morning… By the way – are you happy with the software changes I sent you after our last call?”

When “only” having a phone, it may be natural to have your computer as PIM – or maybe a large and heavy paper/skin edition. Having a computer that is also a phone, you really expect it to manage to do a reasonably good job.

The N900 falls short both as a phone and as a computer. The hardware is capable enough, but the base applications are each, on their own, too light. And they don’t merge together to form a unified product.

With 100% public code, third party code could add the missing features, or release versions that better merge different features. Right now, we are seeing islands of applications that solve small niche problems but fail at integration, since integration requires replacing basic features with open-source alternatives.

Palm failed badly trying to move into mobile phones. Most mobile phone vendors have failed badly releasing PIM solutions.

The N900 is a platform that could merge the two, except for the time needed to replace all parts with open-source alternatives. Nokia probably underestimated the amount of work it would take to move their functionality from Symbian. And since Nokia seems to have moved their focus in other directions, that work now has to be recreated once more by volunteers.

Swap is one of the main reasons why this is complicated and why PC’s and other general purpose computing devices can’t guarantee responsiveness even with the latest of CPUs. A LOT of effort is required to ensure “always on” experience that mimics traditional phones in a “best effort” environment of a generic computing device.

Thank you for your efforts and the wonderful work

But I have a problem that I hope you can solve or give me advice on.

When someone calls me on my N900, it starts ringing after 4 – 5 seconds, not immediately, and if someone hangs up after 2 or 3 seconds my phone does not ring at all and just shows a missed call!

Any advice or future solutions for this problem?

Regards