Capitalism does not work by effectively allocating existing resources. It works by effectively creating new solutions to human problems. The genius of capitalism is that it is an evolutionary solution-finding system. It rewards people for solving other people’s problems… The more people we include, both as entrepreneurs and as customers, the better it works.
In November, I gave a talk at the 38th Internationalization & Unicode Conference in Santa Clara, CA. There’s no video this time, but here are the slides, a picture (thanks to John Huân, #paypal) and its transcription (slightly edited to read better as a post).
jQuery is very popular for its core project, created by John Resig, which made web development simple and reliable in the dark era of browsers. Developers could use the magical dollar sign and not worry about compatibility anymore.
It happens that we have other projects as well, all under the same jQuery Foundation umbrella. A really quick intro: UI, which holds a curated set of user interface components and is famous for its calendar widget, the datepicker. Mobile, which extends UI with responsiveness and accessibility for smartphone and tablet devices. QUnit, a testing framework, which we happen to use ourselves to test our projects. Sizzle, the CSS selector engine included in jQuery. And Globalize, which is where I am.
My name is Rafael Xavier and I’m the Globalize project lead. So, enough introduction. What happened to Globalize?
Everything starts with web developers trying to create a simple web application that needs globalization support. What do we do? Simple. We include libraries that allow us to do so, and the job’s done.
Everything looks fantastic until we go show our amazing progress to our boss and he says: “this is wrong…”.
When that happens (when we spot a bug), well, we get the chance to know our libraries a little better. Anyway, we don’t have much choice other than either (a) figure out what’s wrong and fix it, or (b) file a bug.
At some point, the reviewing process was to look at and use what’s in CLDR. So, we asked ourselves: why not replace our database and use CLDR instead? This is when I first got introduced to Globalize. By that time, this bug had been open for 8 months, and it wasn’t one but two big issues we were trying to solve: (a) we were managing the content ourselves, and (b) the source initially chosen, which was .NET, turned out to have big problems.
When I was first digging into this bug, discovering what CLDR was and how we could change Globalize to use it, my initial concern was how that change would affect our locale coverage and our current functionality. But, to my delight, I found that by adopting it, we would actually double the locale coverage and also fix all the functionality issues we had so far. It was simply a win-win.
Ok! In order to address that whole bunch of bugs at once, we were convinced we had to rewrite the whole library to conform with CLDR (with TR#35). But we were also convinced we had a second challenge to solve: embedded content. Otherwise, tomorrow we’d end up with unhappy developers again, complaining about outdated or wrong CLDR. We didn’t want to simply rename our problem. We wanted to solve it.
Before creating something new, let’s research. What about other libraries? After all, we are not alone in the world. Do they have this problem? If not, how do they solve it?
Let’s start with twitter-cldr. It’s based on CLDR, obviously, as its name suggests. But it embeds CLDR data in the code, as we unfortunately did as well. Do they get the same sort of bugs we had? Let’s see an example. A developer filed a bug about wrong date formatting in the Italian language, which has been fixed in CLDR 26 but hadn’t been fixed in twitter-cldr (by the time of this talk). To give a rough idea of the timeline we’re talking about: 43 days after the developer first reported this bug, the fix was published by CLDR. On the day of the talk, 91 days had passed and the bug was still open.
Next, angular.js. It’s based on CLDR. Actually, it pulls data in from Google Closure, which in turn is based on CLDR. Looking at its master branch (as of the day of this talk), it presented the same Italian bug we just saw.
Here, I’ve picked one existing bug as an example. But there are more; an easy way to find them is by looking at the CLDR changelogs.
Not to mention the bugs that are still open on CLDR. For example, the plural one I used as an example earlier on (shown in my slide). It seems so basic, but it’s real and it happens today, on CLDR 26, for the Brazilian Portuguese language.
What usually happens is that we, again speaking as developers, cannot wait indefinitely for those bugs to get fixed and land upstream ~~boss yelling~~. Occasionally, we fix them locally. Yay. Is it ideal? Some people may say “it works for me”. Although, keeping a locally modified library has its drawbacks: as soon as a new update pops up and we download it, we lose our previous fixes. Unless we keep a list of patches that we can re-apply over and over and, hopefully, not get any conflicts while doing so.
This is definitely not a good maintenance perspective over time. So, what could be done instead? At Globalize, we thought a solution should cover three simple points:
1. It should leverage the official CLDR JSON, so the data comes straight from Unicode and stays correct and up to date.
2. It should allow developers to load as much or as little data as they need. For example, when formatting numbers, we obviously don’t need the plural rules. We don’t even need many of the number fields themselves. It varies on a case-by-case basis.
3. It should avoid duplicating data between libraries. Suppose a developer uses two number libraries, one for formatting and another for parsing, for example an ecma-402 polyfill and Globalize. Both should be able to use the same shared number data.
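To make points 2 and 3 concrete, here’s a minimal sketch (hypothetical code, not the actual cldrjs API) of a shared store that deep-merges whatever CLDR JSON fragments each library loads, so each library loads only what it needs and overlapping data is stored only once:

```javascript
// Hypothetical sketch: a shared store that deep-merges CLDR JSON fragments.
// Two libraries loading overlapping data end up sharing a single copy.
var cldrStore = {};

function load(json) {
  (function merge(target, source) {
    Object.keys(source).forEach(function (key) {
      if (typeof source[key] === "object" && source[key] !== null &&
          typeof target[key] === "object" && target[key] !== null) {
        merge(target[key], source[key]); // recurse into shared branches
      } else {
        target[key] = source[key]; // leaf or new branch: just attach it
      }
    });
  })(cldrStore, json);
}

// A formatting library loads only the number symbols it needs...
load({ main: { en: { numbers: { "symbols-numberSystem-latn": { decimal: "." } } } } });
// ...and a parsing library loads its own fragment into the same store.
load({ main: { en: { numbers: { "symbols-numberSystem-latn": { group: "," } } } } });

// Both fragments now coexist under one shared tree.
console.log(cldrStore.main.en.numbers["symbols-numberSystem-latn"]);
// { decimal: ".", group: "," }
```

The key design choice is that the data is plain JSON owned by the application, so any number of libraries can consume the same store.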
The intriguing question is: if this is so good and obvious, why hasn’t anyone done this yet?
I don’t know the answer to this question. But there’s one fact I haven’t mentioned yet. The first appearance of CLDR in the JSON format actually happened in CLDR v23, released in 2013. And, by that time, all these libraries already existed. So, they all have an excuse. But no more.
Welcome to the new Globalize. It has been rewritten to address all the previous three points. It’s designed to work both in the browser and in Node.js. At jQuery, we systematically test it against desktop and mobile browsers (versions listed in the slide). It supports AMD and CommonJS module loaders.
In the rewrite, we’ve split the former monolithic library into individual modules. In the date module, we find date formatting/parsing. In the number module, number formatting/parsing. And so on… All the functional verticals sit on top of the Globalize core, which works as a base layer. Note we no longer have content embedded in our library; the CLDR content is treated as a peer dependency. Also, note that throughout the Globalize code, we manipulate CLDR: we instantiate locales, we traverse CLDR paths. Some of these operations are not that trivial (deducing likely subtags, for example). Therefore, we’ve wrapped that code into cldr.js. So, we (a) keep Globalize focused on the i18n functions only, and (b) allow other libraries to scaffold and build themselves on top of the same foundation we do. ~~layers topology~~
Cldr.js is a low-level library whose only purpose is to help manipulate CLDR data. So, it’s really cool to be able to develop your i18n library without needing to worry about that.
Like Globalize, it’s designed to work both in the browser and in Node.js. It supports AMD and CommonJS.
It’s unopinionated about how users should load CLDR data. There are dozens of ways; all we expect is the JSON. So, developers can fetch the data dynamically if they wish. They can use AMD plugins, Node.js require, or any imaginable way.
Globalize uses cldr.js’s load method under the hood, so Globalize works the same way.
It automatically deduces subtags using the algorithms specified by the CLDR docs. These variables are then automatically used when traversing the tree to get an item.
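For illustration, here’s a minimal sketch of what that subtag deduction looks like, using a tiny hardcoded fragment of CLDR’s supplemental likelySubtags table (the real “Likely Subtags” algorithm in UTS #35 also handles scripts and many more edge cases; this sketch only maximizes language and territory):

```javascript
// Tiny, illustrative fragment of CLDR's supplemental likelySubtags table.
var likelySubtags = {
  "en": "en-Latn-US",
  "zh": "zh-Hans-CN",
  "pt": "pt-Latn-BR"
};

// Simplified "add likely subtags": expand a bare language tag into
// language-script-territory, keeping any territory the user already gave.
function maximize(locale) {
  var parts = locale.split("-");
  var likely = likelySubtags[parts[0]];
  if (!likely) {
    return locale; // unknown language: leave the tag as-is
  }
  var expanded = likely.split("-"); // [language, script, territory]
  if (parts.length === 2) {
    expanded[2] = parts[1]; // e.g. "en-GB" keeps its territory "GB"
  }
  return expanded.join("-");
}

console.log(maximize("en"));    // "en-Latn-US"
console.log(maximize("en-GB")); // "en-Latn-GB"
```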
Those are the most important features, but there are more.
We’ve talked about the benefits of having CLDR as a separate thing. But, this approach inevitably introduces one extra step for developers: to download the CLDR data themselves. Who likes extra steps?
How can we ease this initial ramp-up? More importantly, how do developers know they have the right CLDR version, compatible with the libraries they’re using?
Libraries need a way to declare their CLDR peer dependency.
We at jQuery use bower and npm to manage our library dependencies. Can we use these same tools to manage the CLDR peer dependency as well?
Yes. We created a package called `cldr-data` for both npm and bower. It works in the lightest possible way we could imagine: it doesn’t actually mirror any data, but it manages the zip URLs and, as a post-install step, downloads the right file and unpacks it.
If you develop a globalization library, or plan to, consider scaffolding it on top of cldr.js and cldr-data.
Let’s see some action. The demo I’m about to show you is going to do three things: (a) fetch Globalize, (b) fetch CLDR data in the JSON format, and (c) run some Globalize code on Node.js.
We can use npm to install all that. If you are not familiar with npm, it’s the Node.js package manager and is commonly used by Node developers. Installing globalize also installs both of its dependencies: cldrjs (the base library) and cldr-data (the node module that pulls in the CLDR data in the JSON format from the Unicode servers).
We’re all set. Let’s use Globalize. Using it right away makes it complain about missing CLDR data.
We developers need to feed Globalize with the proper CLDR JSON files. Here, I’m using the cldr-data package; it’s optional. We pass in only the content we need. Oh, but which files do we need? If you need help figuring that out, see the table in our docs. Then, use it.
Let me show you how easy it is to manage the CLDR content. “The user has the power”. Do you remember the Italian bug we saw earlier on, which affected twitter-cldr and angular.js? Let’s reproduce it by using CLDR version 25. Note the problem is the usage of `/` (wrong) instead of ` ` (correct).
Do you want to fix it and move to the next level? Yes. Note that using CLDR version 26 is enough to fix the problem. We don’t need to update our JS library or application code.
Do you remember the Brazilian Portuguese plural bug, which is currently present on CLDR 26? Let’s reproduce and fix it. In Brazilian Portuguese, the plural form of 0.5 is not other, but one.
There’s no CLDR 27 yet. But let’s set the right rules ourselves and retry. Ta-da! Fixed! Note that we were able to fix it dynamically in user code.
The same thing works in browsers as well. We can use the bower package manager, which is commonly used in client applications.
Please find examples using these various environments (bower, AMD, or Node.js) in the Globalize docs.
What about performance? I’m advocating a library that traverses the CLDR tree and parses rules dynamically at runtime, which is totally fine during development. But what about production? Is there any optimization?
When formatting a number, there’s actually a two-step process: (a) setup and (b) execution, where setup is considerably more expensive than execution. The runtime difference is an order of magnitude.
Note that the same is also true for date formatting, message formatting, or even parsing all of them.
So, an obvious way to speed up iterations in your code is to generate the formatter outside the loop. The same idea is valid for server applications: we probably want our formatters to be created in advance, so when requests arrive, we can process them quickly by simply executing the formatter. Obviously, this is a simple demonstration. In a real-world application, an ICU MessageFormat would usually go here instead, or a number format would be an input for a templating engine like Mustache or Handlebars. But it all follows the same idea.
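A sketch of that setup/execution split (with hypothetical names, not the real Globalize API): setup does the expensive work once and returns a closure; execution is then just a cheap function call.

```javascript
// Hypothetical sketch of the setup/execution split.
// "Setup" does the expensive one-time work (imagine CLDR traversal and
// pattern parsing here) and returns a closure; "execution" is cheap.
function numberFormatter(options) {
  var decimals = options.decimals; // expensive setup would happen here
  return function (value) {
    return value.toFixed(decimals); // cheap execution step
  };
}

// Create the formatter once, outside the loop (e.g., at server startup)...
var format = numberFormatter({ decimals: 2 });

// ...then execute it cheaply for each value/request.
var out = [1, 2.5, 3.14159].map(format);
console.log(out); // [ "1.00", "2.50", "3.14" ]
```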
Another distinction between setup and execution is that all CLDR manipulation happens during setup, which can be really handy. For example, any missing-CLDR error will be thrown as soon as the server starts in this example. So, all subsequent client requests are safe from any CLDR manipulation error.
Let’s talk a bit about client applications. On the client side, performance is also about how fast our page loads. So, size matters. How do we get the smallest and leanest bundle for production?
Let’s see an example. Suppose we need a plural function for English. This is what we need: a function that, given a number, outputs the plural form. But in order to get that, we need the English plural rules and the library code that parses those rules in order to generate this simple function. In the end, this tiny thing is what matters. What if we could precompile it at build time and deploy only this tiny function? The good news is that it’s possible. All our formatters/parsers output a precompiled function. So, by managing that at build time, we can remove the need for most of the library at runtime.
The plural function is an extreme example. But we can often save bytes when formatting or parsing other stuff too.
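To give an idea of what “precompiled” means here, this is roughly the kind of tiny standalone function a build step would emit for the English cardinal plural rules (a simplified sketch; the real compiled output also handles CLDR operands such as the number of visible fraction digits):

```javascript
// At build time, a compiler parses the CLDR plural rules for English
// and emits a tiny standalone function like this one. Only this function
// needs to ship to the browser, not the rules nor the rule parser.
function pluralEn(n) {
  return n === 1 ? "one" : "other";
}

console.log(pluralEn(1));   // "one"
console.log(pluralEn(2));   // "other"
console.log(pluralEn(0.5)); // "other"
```

If you are familiar with how Handlebars precompiles templates into plain functions, this is the same trade: pay the parsing cost at build time, ship only the result.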
If you are familiar with templating engines like Handlebars, the idea of deploying precompiled bundles may sound familiar.
You know, what’s most important about this whole story is that we didn’t do it all ourselves. We’ve been systematically collecting feedback on every aspect of our idea and our implementation. Part of the solution is a contribution from Wikipedia, part from Alex Sexton, part from people all over the internet.
The jQuery foundation has a mission that goes beyond supporting the development of our products. It’s about supporting three things.
Improving the open web, which means having open standards publicly available and free to implement. ~~An open web~~
Making the web accessible for everyone, which means including people with disabilities and people living in poor conditions. ~~An accessible web~~
And, ~~Collaboration with the development community~~.
“One thing that really struck me during my limited research was how many overlapping libraries there are especially for number and date-time formatting”
We have been in contact with lots of people in different organizations. As an effort of improving the coordination and potentially the collaboration between projects, we’ve created a common channel for communication. Hopefully, with the joint effort of this group, we’ll make this whole farm more productive.
It’s a very recent initiative, but it’s already been used for announcements of new accomplishments, and we’re collaboratively producing a comparison grid to help clarify the differences, the gaps, and the strengths and weaknesses of the different projects.
All the libraries I’ve shown are functional. They work. So, feel free to use them. I hope you enjoy them. File bugs if you find any trouble; we really listen to what you say. Help us design, implement, and test. Collaborate with us. Join us.
Here’s the video…
Other talks of this same conference can be found here http://events.jquery.org/2014/san-diego/
You have probably seen that Google has a project aiming at driving cars in a fully automated way. It has been on the news and so on, showing the cars driving themselves through San Francisco in a civilized manner.
But what was news to me was this: racing! Not bad.
The view from outside:
Sebastian Thrun’s TED Talk:
Recently, I was surprised to discover that the website of the clothing store Abercrombie (http://www.abercrombie.com/) officially ships to Brazil! Here are the pros and cons of my experience.
Shipping time: VERY FAST!!! I bought on Sunday, and it arrived at my home (in the countryside of São Paulo state) on Thursday. That’s right! It took only 4 business days. It came via FedEx.
Price: STEEP! The taxes amount to practically 100% of the product’s value. There is the 60% import duty + the bloodsucking “18%” ICMS + customs clearance fees. I put 18% in quotes because ICMS, besides being charged on top of the total value (declared value + import duty), is charged “twice” through the following calculation:
In the state of SP, ICMS is “18%”. The nominal “18%” becomes almost 22% effective, and in practice (over the imported value) it ends up being 35%:

(declared value + import duty) / (1 − 0.18) × 0.18
= 1.6 × declared value × 0.22
≈ 0.35 × declared value
This way of calculating is called “cálculo por dentro” (tax-inclusive calculation). There is a good explanation in this PDF.
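The tax-inclusive calculation can be double-checked with a few lines of code (rates taken from this post: 60% import duty, 18% nominal ICMS, and an example declared value of 100):

```javascript
// Check the "tax-inclusive" (por dentro) ICMS math from the post:
// ICMS is charged on a base that already includes the ICMS itself,
// so the nominal rate r becomes an effective rate of r / (1 - r).
var importDuty = 0.60; // 60% import duty
var icms = 0.18;       // nominal ICMS rate in SP

var declared = 100;                          // example declared value
var withDuty = declared * (1 + importDuty);  // 160
var icmsDue = withDuty / (1 - icms) * icms;  // tax-inclusive calculation

console.log((icms / (1 - icms)).toFixed(3));  // effective rate: "0.220"
console.log((icmsDue / declared).toFixed(2)); // "0.35" of the declared value
```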
So, add the taxes to the customs clearance fee and consider a total rate of practically 100% over the purchased value.
A post by Cezar Taurion that portrays the Open Source scene and its business models very well.
The original post in full.
Every now and then I give interviews to the media talking about Open Source. And one of the most frequent questions is: “How much revenue does IBM get from Open Source?”. Well, when we talk about the traditional software licensing model, this question has an easy answer: just look at how many licenses were sold and their average price. But with Open Source it is different. It is very hard to capture the revenue volume precisely. Much of the Open Source revenue is obtained indirectly. One example is Google, which provides software such as Android for free in order to leverage advertising revenue. Consider also IBM, which supports several projects such as Linux, Eclipse, and others, leveraging indirect revenues: more servers, services, and even non-Open-Source software.
Therefore, we need to reframe the question. The direct, accountable revenue of a given Open Source software is not a direct measure of the success or failure of its economic impact. For this analysis, we must look at the ecosystem as a whole. A common mistake is to compare the revenue of software sold under the license model with the revenue obtained by Open Source distributors. Paid distributions of Open Source software, much like SaaS, are sold under a subscription business model, with revenue spread over several years rather than concentrated in a single payment. Thus, comparing revenues from different business models is comparing apples and oranges.
Moreover, there is no correlation between the direct revenue obtained from a given software and its use by society. It is very hard to measure the use of Open Source software precisely. We can count the downloads registered from a given site associated with the software in question. But from there, since its free distribution is perfectly possible and even encouraged, it becomes difficult to account for the countless other copies circulating around the Web.
Nevertheless, it is indisputable that Open Source is spreading fast. Its main appeals to the market are quite compelling: no upfront payment for a usage license (it trades capex, or capital cost, for opex, or operational cost), lower total cost of ownership, no forced vendor lock-in, and easier customization thanks to free access to the source code. We also observe that its spread is not homogeneous across all software segments. Its use is much wider in operating systems, web servers, and databases, but still limited in other sectors, such as ERP and business intelligence.
But Open Source is not growing only in the traditional field of software use, commercial applications. We see its spread accelerating as the Web spreads (much of the code running Web 2.0 and social networks is based on Open Source dynamic languages such as PHP, Python, and Ruby), and we will see a lot of Open Source code at the base of sensors, actuators, digital TV set-top boxes, netbooks, mobile phones, and other new devices. Open Source is also at the technological base of many cloud computing infrastructures.
… +see the full article here.
Real-time traffic using people’s collaborative work (crowdsourcing):
See live traffic of Seattle here.
Google is the search engine leader today. Meanwhile, Yahoo and Microsoft have just made a deal to fight together for the remaining ~20% of the market. Although it’s still a billion-dollar market, a new search era is coming (actually, it’s already here).
Jeff Jarvis commented on it on the TWiT TV show presented by Leo Laporte, which can be watched here: http://odtv.me/2009/08/twig-1. He said:
Today you have to distribute your stuff all across the web; you can no longer expect people to come to you. So, the notion of having a home page like in 1999, where the whole world comes to you through Google, may be past. Just look at Google Wave, Google Elements, and things like that: it’s about distributing yourself and having your audience distribute you everywhere, and that takes us past search to the new means of discovery. Twitter is one of them.
Pay attention to how the social web influences you in filtering information and helping you search for the right thing.
Do you travel often?
If you study away from home, work in another city, are in a long-distance relationship, like to travel, or for whatever reason would rather share a ride with friends than travel alone, this may interest you…
The internet today offers us some alternatives to get closer to groups of people with the same interests. For ride sharing, there are orkut communities, e-mail groups, and even some websites where we can post our interest in offering (or looking for) a ride.
Looking for a ride through orkut communities is quick and easy; however, there is the drawback of safety and of trust in the people offering them, since the communities are too open (public).
E-mail groups offer safety by gathering people with a certain bond between them, but bring the drawbacks of the initial time needed to get established in the group and of a rigid itinerary. The user needs prior knowledge of the group, must ask the moderator to be included and, once everything is settled, is limited to a certain specific itinerary, usually rides between two cities.
As for ride-sharing websites, let me introduce you to Simbora.
Simbora combines:
- The safety and trust of e-mail groups, since it is based on the social network model. Thus, only people close to each other see each other’s rides.
- Ease of use: just type From and To and, zap, your friends’ rides show up. If tomorrow you are going to another city, just type its name in the search instead of having to find a new e-mail group.
- The richness of a website built specifically for this purpose. Simbora is a mashup and uses services such as Google Maps to enrich the results. So, visualize the itinerary, distance, and travel time. See who is closest to you. A person leaving from São Paulo is rather generic!
- Coverage. Simbora has plug-ins that feed it automatically; for example, there is an automatic reader (parser) of the e-mails exchanged in ride-sharing groups. This way, Simbora stays up to date (in sync with the other sources).
The first released version is still a draft (an alpha version), but it already works. Just visit: