Feeds

104649 items (96858 unread) in 19 feeds

Friends Friends
Build Build
Heads Heads
News News
fun fun

Items by josh

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

2 Star

Josh

2 Star

Comments for Josh the Outspoken

2 Star

Josh

2 Star

Josh

2 Star

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

2 Star

Josh

2 Star

Comments for Josh the Outspoken

2 Star

Josh

2 Star

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

2 Star

Comments for Josh the Outspoken

Josh

2 Star

Comments for Josh the Outspoken

2 Star

Josh

  • Permalink for 'Josh/2008/11/16/habes___2008_11_16T09_34_00'

    habes @ 2008-11-16T09:34:00

    Posted: November 16th, 2008, 10:45am MST by Josh
    I went to an anti-prop-8 rally and parade today, and it was so awesome. It was just full of love and positive energy. For a bunch of people who just got a royal "fuck you" from the state of California, it was a surprisingly upbeat event.

    The weather was beautiful, and from what I hear was even more stunning in San Francisco, which also had a rally. God clearly voted today, but unfortunately s/he is not a citizen of the state of California.

    I climbed many trees and other city habitat to get good photos (feeling a bit like Zaccheus in the process), temporarily putting myself in more danger of falling than I normally would. But it all turned out ok.

    From Marriage Equality March


    From Marriage Equality March


    From Marriage Equality March


    From Marriage Equality March


    My favorite sign:

    From Marriage Equality March


    The only people I could find who weren't having a good time:

    From Marriage Equality March
  • Permalink for 'Josh/2008/10/23/habes___2008_10_23T09_32_00'

    habes @ 2008-10-23T09:32:00

    Posted: October 23rd, 2008, 10:37am MDT by Josh
    Check out my pictures from Phil and Mary's wedding. They turned out awesome, thanks in no small part to this snazzy $1500 lens that I rented for $30. I never could have gotten those low-light portrait shots without it.
  • Permalink for 'Josh/2008/10/09/habes___2008_10_09T10_54_00'

    habes @ 2008-10-09T10:54:00

    Posted: October 9th, 2008, 12:25pm MDT by Josh
    Whoa this is getting ugly from the Republican side:
    At rallies this week in Florida, crowds jeered and taunted members of the news media. One man hurled a racial epithet at a black television crewman, telling him, "Sit down, boy".

    Yesterday, for the second time in three days, a speaker at a McCain rally in Pennsylvania referred to Obama's middle name, Hussein, in an effort to cast doubt in his religion and background. Obama is a Christian.

    At the same rally, shouts of "terrorist" and "liar" could be heard following references to the Democratic candidate. On Saturday, McCain running mate Sarah Palin sought to link Obama to Ayers. "Kill him!" one man in the crowd shouted, not specifying who.

    --The Guardian
    Holy crap, it's the two minutes hate! It is sad and truly frightening that this is what the Republican party has come to. They have turned themselves into the party of xenophobia and hate.

    They are spreading insinuation about Obama's link to a 1960s "domestic terrorist" without recognizing that they are breeding a new brand of extremism themselves! If guys at their rallies are shouting "terrorist," "liar," and "kill him," how far are they from actually doing something violent? The McCain campaign is taking a scorched earth strategy here, which is dishonorable and wrong.

    One lesson I take from this is that no matter how decent a guy is when he's nominated to the Republican ticket, he'll never survive the experience with his decency intact. Because winning on the Republican side requires "mobilizing the base," and the base demands red meat. The base thrives on culture wars. The base has this need to find an "other" and demonize them.

    It's a dark moment on the Republican side of the aisle.
  • Permalink for 'Josh/2008/10/05/habes___2008_10_05T12_56_00'

    habes @ 2008-10-05T12:56:00

    Posted: October 5th, 2008, 2:10pm MDT by Josh
    Dear Sarah Palin,

    You want to talk about questionable associations? How about addressing a political party whose founder openly hates America, and who died buying plastic explosives on the black market?

  • Permalink for 'Josh/2008/10/03/video_candy'

    video candy

    Posted: October 3rd, 2008, 1:07am MDT by Josh
    Stop winking at me Sarah!!



    Yep, you really said all those things:


    Why does Senator Obama believe that McCain and Bush agree on almost everything?


    Finally, FOXNews attacks Obama with the same attacks they used on Kerry:
  • Permalink for 'Josh/2008/10/01/habes___2008_10_01T10_26_00'

    habes @ 2008-10-01T10:26:00

    Posted: October 1st, 2008, 11:36am MDT by Josh
    Check out this really interesting video from the New York Times about the debating styles of Sarah Palin and Joe Biden. It shows some highlights of their previous debates (Palin for Governor of Alaska, Biden in the primaries).

    One thing to take away from this is that despite her embarrassing performance over the last week or so, Sarah Palin has in the past been a formidable opponent who has a maddening way of being being cutting and aggressive in a way that viewers love. The LA Times article Underestimate Palin at your own risk, former rivals say comes to the same conclusion. That article ends with:
    "When you try to prove she doesn't know anything, you lose, because audiences are enraptured by her," Halcro said. "And her biting comments give you a sense of how competitive she is. Anybody who doesn't take her seriously does so at their peril."
    So playing the expectations game that the campaigns love to play, I'd say that we should expect to see an aggressive and in-control Sarah Palin, judging by her previous performances.
  • Permalink for 'Josh/2008/09/24/habes___2008_09_24T09_32_00'

    habes @ 2008-09-24T09:32:00

    Posted: September 24th, 2008, 1:04pm MDT by Josh
    Not quite two weeks ago I stepped on an airplane that would eventually (after two layovers) take me to Montpellier, France, a town on France's Mediterranean coast halfway between Spain and Italy. The occasion was the Renaissance Consort Workshop, a week-long course of singing Renaissance Polyphony one on a part. The course is hosted in a small town in the south of France called Roujan, which is not particularly close to anything, but is full of charm and is not at all catered to tourists. This part of France is solidly wine country, which is abundantly clear not only from the rows of grapes you see everywhere, but also the tractors that are frequently driving through the streets pulling carts full of harvested grapes behind them.



    Though the picture doesn't show it, the building where this tractor is depositing its grapes is the same building where the finished wine is being sold. This single building quite literally takes grapes out of a tractor's cart on one end and sells them as bottles of wine on the other!

    The house in Roujan where the course is hosted is called La Maison Verte, and it is owned by Anne and Fran who use it to host courses like the one I attended. La Maison Verte doesn't particularly stand out from the rest of Roujan when viewed from the street, but once you get inside you're transported into a sort of fairy tale.








    So the course is essentially a chance to spend a week singing some of the world's most beautiful music in a fairy tale setting with world-class musicians as your tutors.

    The tutors this year were Robert Hollingworth, Erik Van Nevel, and Francis Steele. Robert Hollingworth looked terribly serious to me in his publicity photo that I saw before I arrived, but within two seconds of meeting him I realized he was a quite charismatic and personable chap with a boyish charm that instantly puts people at ease. In every situation he had something amusing or insightful to say (and often both). I didn't get to work with him in sessions as much as the other tutors, but I did sing a Monteverdi madrigal in one of his sessions, which inspired me to give them another look -- they had never resonated with me before. Robert also led warm-ups, and really focused on getting rid of tension in our tongue and throat.

    Erik Van Nevel was full of passion; I'll never forget his coaching on "Christe, adoramus te" by Monteverdi. It is a relatively short piece that on the page doesn't look particularly dramatic, but Erik's interpretation of the piece had an extreme amount of emotion: one moment humble, the next desperate, the next pained. I am quite confident that even in the end we gave him only a fraction of the emotion he was asking for. Erik also offered insight about how to make sure our interpretations were stylistically appropriate to the music. Erik pointed out times when we were doing things that were musical, but using "tools" that were Baroque rather than Renaissance.

    And finally Francis Steele, who brings a deep reverence to the music and never-ceasing attention to the text. I have to admit that I've never given the text the care or attention it deserves in this music. I shame myself to say so, but in the past I've put forth the opinion that Latin serves primarily as a way to put a variety of vowels to polyphony. But Fran shows us all first-hand how much the music can improve when when you are sensitive to the text. He also demonstrated that resisting the urge to let the music "take off" volume-wise can make the music clearer and ultimately more enjoyable to sing.

    We all arrived on Saturday afternoon, a beautiful sunny day as nearly every day was. Anne directed us to our rooms so we could deposit our luggage, and I spent a while just wandering around with my jaw on the floor. My room was quite dark when I walked in, giving me the opportunity to bring it to life by opening the doors onto my own private balcony that looked out over the town of Roujan.




    We had our first singing session that night. We started with introductions, and I was somewhat surprised that the vast majority of the 17 attendees had attended the course before. I could already see that this was going to be a great experience, but it was still surprising that so many people were motivated enough to make the time and financial investment year after year. Nearly everyone was English, though some had been living in France for extended amounts of time. I was one of three Americans, and the only one currently living in America -- my trip from Seattle was by far the longest anyone had come for the course. However, to my surprise, the cooks (the exquisite, AMAZING cooks -- I'll get to that later) were originally from Seattle also.

    In our first session Fran led us all in Diliges Dominum by Byrd, an eight part piece of tall chords that move in a slow but steady half-note (or "minim" as I would concede to the Brits) pulse. I almost immediately acted in an instinct to add intensity to the piece by really leaning into rising lines, but Fran stopped us and said "don't try to come out of the texture." He pointed us to the text for the piece, which is "Thou shalt love the Lord thy God from thy whole heart, and with thy whole soul, and with thy whole mind. Thou shalt love thy neighbour as thyself." He argued that it is a corporal, corporate text that demands total togetherness. I should mention that my recounting of Fran's remarks cannot do justice to the way that he delivers them. He speaks of the music and the texts to which they are set with such sensitivity and humility that it inspires us to sing with an entirely different attitude toward the music than we might have walked through the door with.

    For me that opening session set a tone for the week: this was not going to be a week of powering through big pieces, it was going to be about sensitivity and awareness. These were lessons that were probably overdue for me as a musician: I've always been primarily motivated by melody, harmony, sonority, and musical brilliance, and haven't given text or meaning the attention it deserves. So this week was going to expand my horizons.

    After that session we had our first supper, made by our resident cooks who were a couple originally from Seattle but now living in France. I have to say that I was totally unprepared for what an effect the food would have on me. I eat out a lot and I definitely enjoy food, but food has never been a motivating factor in my life. At La Maison Verte I found myself eagerly awaiting every course, reveling in every new flavor. The sorts of things that foodies always say suddenly started making sense. I ate things that on paper I never would have imagined liking, like tomato and peach soup. The food was exquisite.

    The next several days followed a predictable format. There were three sessions of singing per day: two hours in the morning, one and a half hours in the early afternoon, and another hour and a half in the early evening. In each session we were divided into three groups among three rooms and three tutors. Everything was one on a part. Fran would lay out the schedule for these group sessions every morning, attempting to mix us up among all the singers, all the tutors, and all the rooms.

    The rehearsal spaces were a sitting room (the room where we ate breakfast), a small art gallery, and La Maison Verte's main rehearsal room. I had gone with some worry that I would find the rehearsal spaces frustratingly dry in their acoustic. What I found is that though they were certainly not reverberant like a church is, I never found myself frustrated about them. They didn't force one to work disproportionately hard. And they made our opportunity to sing in the church at the end of the week a great reward.

    Each evening at 6pm we would have a brief "sharing" of three or so of the nine pieces that had been sung that day. This was a chance to see what everyone else had been up to. I also much appreciated the repeated opportunities to "perform," since I have performance-related tension problems that manifest themselves even in the most informal sort of "performance." All in all I found that the course's itinerary worked very well. I even conceded once our day off came that it was good and necessary to let our voices rest for a day, even though I recoiled in horror when I first learned that we would have a whole day in the middle of the week without any scheduled singing!

    There is one part of the course's format that doesn't make it onto the itinerary but is nevertheless an absolutely essential part of the experience. You see, the last thing on the itinerary every day is supper at 8pm. What is not captured are the hours of talking, drunken singing, and general silliness that extend well into the morning, for as long as anyone will stay up. And one thing you can be sure of is that Fran will be the last one to retire, and until the very end will share with you a mix of silly banter and profound moments that it's hard to describe unless you know the guy. One night when there were only four of us remaining he whipped out a set of part books written in old notation, and we attempted to read through them despite the strange notes and unfamiliar clefs that can change from line to line! The next morning Fran tacked an extra event on the day's schedule: "12am: Josh lectures on singing from sixteenth century part books (with full illustrations by Josh)," which was especially devious given that I was clearly the most clueless about anything having to do with part book singing. But naturally by 12am that night I was suitably drunk that I heartily delivered a pseudo-intellectual treatise on the subject.

    Wednesday was our day off, and I was lucky enough to be able to tag along with a fellow participant who was taking a drive through the French countryside. She had an interest in Geology, and had marked on her map several places that had some geological significance. We stopped in several beautiful villages (again, none of which had any tourist bent whatsoever) and walked a ways up the Gorges d'Héric. On one of the more rural roads our way was blocked at one point by a cow and later by an entire herd of sheep that were being moved from one field to another. As I mentioned, I was initially disappointed that we had a day off, but in the end I was quite happy for the opportunity to see some of the surrounding area.

    The week culminated in a concert on Friday night at the town church, which was only a short walk from La Maison Verte. Despite the fact that we had been sharing every day, the concert was a great opportunity to have a bit more formal and prepared presentation of our music, and to sing for other people. The acoustic also felt very rewarding after the week of singing in smaller spaces.

    My week at La Maison Verte was fantastic, and I hope I get a chance to go back. The people were lovely, the tutors fabulous, the food exquisite, and the music deeply satisfying.
  • Permalink for 'Josh/2008/09/02/habes___2008_09_02T02_37_00'

    habes @ 2008-09-02T02:37:00

    Posted: September 2nd, 2008, 4:25am MDT by Josh
    Any fans of The Usual Suspects? I love this movie, but I'm always confounded by trying to piece together the plot given the ending.

    Ok, so Verbal Kint is Keyzer Soze. But there are stil so many questions I have about what really happened.

    1. Why did Keyzer/Verbal even get caught after the ship blew? He's such a bad-ass that he killed everyone on the ship and then blew it up, but he couldn't figure out a way to escape before the authorities arrived? Seems implausible.

    2. So Arturro Marquez ("The one guy who can identify Keyzer Soze") was the target of the whole operation on the ship. The story is that he was a smuggler who knew all about Keyzer Soze. He was arrested for drug trafficking in New York, but escaped to California where he was captured again. In California they were setting up extradition. He didn't want to go to jail so he named 50 names (including Keyzer Soze), but then escaped and fell into the hands of his own people, who decided to sell him into the hands of some Hungarians who wanted to stick it to Keyzer Soze.

    But his story doesn't really make sense. If he was going to be extradited (presumably to Argentina) then why is he doing a deal with the American authorities? Doesn't seem like that deal would do him much good once the extradition went through.

    3. Who killed Edie and why? Kujan thinks it was Keaton, but Keaton pretty clearly died on the boat. Verbal/Keyzer doesn't seem to have any reason to do away with her -- everything she knows about Arturro would be public record.

    4. How did the Hungarian who survived with burns know that the person he saw was Keyzer Soze?

    5. Who brought Edie onto the extradition case? It must have been someone related to Keyzer Soze (even if "Kobayashi" was not his real name) -- it couldn't have just been a coincidence. Whoever it was, can't Kujan use that connection to go straight to Soze?

    I'm sure the real reason is "relax, it's a movie." But I want everything to make sense in the end!
  • Permalink for 'Josh/2008/08/15/habes___2008_08_15T16_17_00'

    habes @ 2008-08-15T16:17:00

    Posted: August 15th, 2008, 5:18pm MDT by Josh
    To add to the thoughts about the Georgia/Russia thing, here's a video on YouTube where FoxNews interviews South Ossetians who take over the time slot by condemning Georgia and thanking the Russians.

    12-year-old girl: "but before I say anything else I just want to say that I was running from Georgian troops bombing our city, not Russian troops. I want to say "thank you" to the Russian troops that were helping us out."

    Not saying this thing is simple or one-sided, but there definitely is another side to this.

    Sarah, did you get in touch with your Georgian friend?
  • Permalink for 'Josh/2008/08/13/habes___2008_08_13T13_02_00'

    habes @ 2008-08-13T13:02:00

    Posted: August 13th, 2008, 2:24pm MDT by Josh
    What the heck is going on with our American response to this Russia/Georgia thing? Now maybe I'm just getting bad/biased information from the Wikipedia article on the South Ossetia war, but to me the situation is:

    * South Ossetia is a part of Georgia that is ethnically different from the rest of Georgia.
    * South Ossetia has been de facto independent from Georgia since the early 90s (though not internationally recognized)
    * South Ossetia overwhelmingly favors independence from Georgia, as evidenced by a 2006 referendum in which 99% of registered voters voted for independence with voter turnout at 95.2%.
    * after escalating tensions and small-scale fighting, Georgia launches a full-scale attack against the South Ossetia capital in the middle of the night.
    * Russia retaliates by pushing back the Georgian invasion and attacking Georgia.

    This is a complicated situation, but all we as Americans are offering in our media and from our leadership is a simplistic "evil Russia invades democratic Georgia" story. Our leaders and our presidential candidates give tough talk against Russia, without acknowledging the fact that Russia was retaliating against a Georgian invasion and attack against a city.

    At first I thought I was just crazy and missing something major, but apparently apparently I'm not the only one who feels this way.

    Geez, the last thing we need is to introduce more tension in our relationship with Russia. How would we react if Cuba or Venezuela invaded the Bahamas? You'd better believe that we'd not only drive them out, we'd launch strikes against military targets in their home countries. And how cynical would we be at that point if Russia spun it as "US aggression"?

Comments for Josh the Outspoken

2 Star

  • Permalink for '2_Star/2008/08/11/A_syntactic_dilemma__and_an_intro_to_Gazelle%e2%80%99s_ambiguity_resolution_'

    A syntactic dilemma (and an intro to Gazelle’s ambiguity resolution)

    Posted: August 11th, 2008, 1:55am MDT by josh

    As I’m adding more capabilities to the Gazelle grammar description language, I’ve come up against a problem for which I can’t find any solution I like. So I’m putting it to you, my esteemed readers, to suggest a solution I will like more than the ones I’ve already brainstormed.

    Here’s the problem. I’m at a point where I want to support prioritized choice in Gazelle grammars. Prioritized choice is a way of letting grammar writers explicitly resolve ambiguities that have crept into their grammar. Take the following grammar rule:

    s -> "X" | "Y";

    The two alternatives in this rule (X and Y) have equal priority. That doesn’t actually matter in this case, since no text can match them both, but now consider changing the rule a bit:

    s -> "X" | "X";

    Now there is a problem, because an X matches both alternatives. There are two legal parse trees that are both correct according to the grammar. As a result this grammar is ambiguous. A parser doesn’t have any reason to choose one over the other, because they are of equal priority.

    The solution to the problem is to let the user specify which alternative should be taken when both match. If we allow users to do this by introducing prioritized choice, our solution looks very much like how Parsing Expression Grammars (also known as PEG’s) work. And the syntax that has become very standard in the PEG literature is to use “/” as the operator for prioritized choice. So our ambiguous grammar would be made unambiguous by using prioritized choice:

    s -> "X" / "X";

    Now the parser will always choose the first alternative, and the grammar is no longer ambiguous. You might wonder “what’s the point?”, since the second alternative will never be taken, but in many other cases of ambiguity, the choice of lesser priority will be taken. The classic example of ambiguity (which can be resolved in this way) is if-then-else:

    stmt -> "if" expr "then" stmt |
            "if" expr "then" stmt "else" stmt |
            expr;
    expr -> "true" | "false" | /[0-9]+/;

    This grammar is ambiguous, because the fragment:

    if true then if true then 1 else 2

    …can be parsed in two different ways. The “else” could be assigned to either “if” — both are valid according to the grammar.

    So I definitely know I will be introducing prioritized choice. The question is what syntax to use. Like I mentioned before, I would really really like to use “/” since there is a strong convention from the PEG world for it. But unfortunately this introduces a major ambiguity in Gazelle’s own grammar, because it conflicts with the syntax for regex literals. What does this mean?

    s -> a / b;
    a -> c / d;

    Is that two rules that both use prioritized choice, or is it one rule that has a giant regular expression in it?

    s -> a / b;
    a -> c / d;

    I really don’t want to give up using slashes to introduce regular expression literals, because that is a really strong convention that a lot of programmers are familiar with. But I also really don’t want to give up using the really strongly established convention of using slash for prioritized choice.

    Anyone have an idea for a compromise I might like? The only one I can think of is adding a prefix to the regex literals sort of like Perl’s m{} form for regular expressions. It could be something like m/regex/. But I hate to add that uglification to the grammar, especially since regular expressions are going to be SO much more common than prioritized choice.

    Any ideas, anyone?

    Bonus: the other form of ambiguity resolution

    Choice/alternation isn’t the only possible source of grammar ambiguity. Gazelle’s constructs for optional and/or repeating grammar fragments can also be ambiguous. The if-the-else example is perhaps more idiomatically written in Gazelle like so:

    s -> "if" expr "then" stmt ("else" stmt)? | expr;

    It’s written differently, but it’s just as ambiguous as before. So we need the same sort of tool to resolve this ambiguity — a way to say “prefer to match the optional part” or “prefer to NOT match the optional part.” This is equivalent to defining a “greedyness” to these operators like Perl lets you do in regexes. The syntax I am planning for these is:

    s -> a b?; // no preference to match the b or not
    s -> a b+; // no preference to keep matching b or not
    s -> a b*; // no preference to keep matching b or not

    non-greedy variants
    s -> a b?-; // prefer to NOT match b (non-greedy)
    s -> a b+-; // prefer to STOP matching b (non-greedy)
    s -> a b*-; // prefer to STOP matching b (non-greedy)

    greedy variants
    s -> a b?+; // prefer to KEEP matching b (greedy)
    s -> a b++; // prefer to KEEP matching b (greedy)
    s -> a b*+; // prefer to KEEP matching b (greedy)

    I’m not following the Perl convention here (Perl uses “?” instead of “-” to make the match non-greedy). But on the other hand:

    1. Perl doesn’t have a way to explicitly make the match greedy — matches are greedy by default. So it has no convention for a “make greedier” operator.
    2. not very many people know/use Perl’s greediness operator anyway

    So to write the if-then-else example with explicit ambiguity resolution, you would write:

    "if" expr "then" stmt ("else" stmt)?+ | expr;

    …because when there is an ambiguity, we want to bind the else to the most recent if.

    An aside: “great, so I can just start slapping greedy and non-greedy modifiers on everything, just to be safe, right?”

    Not so fast there. Ambiguity resolution isn’t something you should take lightly, because a grammar with an ambiguity in it presents a user-facing irregularity in your language. For example, the if-then-else example is a user-facing issue: either “else” pairing makes logical sense, and it’s totally arbitrary how you decide to resolve the ambiguity. You have to tell your users about it. You should strive to absolutely minimize the numbers of such ambiguities in your language whenever you can!

    The danger of using ambiguity resolution operators willy-nilly is that you don’t necessarily know if they’re actually resolving ambiguities. For example, if you wrote:

    s -> "X" / "Y";

    …the prioritized choice here is completely extraneous, since this grammar is unambiguous already. Because ambiguities are so important to keep track of and educate your users about, you don’t want to have to wonder whether a prioritized choice is resolving an ambiguity or not. You want to know that it is.

    To this end, Gazelle will keep track of all ambiguity-resolution operators and error if any of them are never actually used to resolve an ambiguity. So you can’t just sprinkle them around “just to be safe,” you must only use them where they are effective. And this will make Gazelle grammars much more useful, because you will know exactly where all your ambiguities are.

    By the way, Gazelle can’t detect any ambiguity in your grammar; calculating whether an arbitrary context-free grammar is ambiguous or not is undecidable. Basically, all grammars fall into one of several categories:

    1. Gazelle can handle the grammar (ie. the grammar is in the green region of the diagram in my previous entry).
    2. Gazelle can handle the grammar except for the fact that it’s ambiguous.
    3. Gazelle cannot handle the grammar, and would still not be able to even if you resolved the ambiguities.

    If your grammar falls into category 3, Gazelle doesn’t know whether the grammar is ambiguous or not. But for grammars in categories 1 or 2, Gazelle knows exactly where the ambiguities are.

    Bonus #2: a word about Parsing Expression Grammars

    Something you might say (especially if you’re a PEG fan) is “gosh Josh, if you’re going to all this work to introduce ambiguity resolution, why not just use PEG’s flat-out and abandon all this LL(*) nonsense?” Because essentially what I’m doing is supporting both PEG-based constructs (like ordered choice) and context-free grammar constructs like non-ordered choice. And PEG’s support a larger set of grammars than Gazelle ever will (I believe they support any non-left-recursive PEG, though I don’t have a reference for this).

    There are two really major reasons why I don’t think PEG’s are the way to go. The first one is related to the point I was making before. With PEG’s all choice is prioritized choice. However, with PEG-based tools you have no idea if there are real ambiguities in your grammar or not. PEG’s are defined such that ambiguity does not exist, but this does not make if-then-else style issues go away, it just sweeps them under the rug. Even if you don’t call it “ambiguity,” it is still a user-facing issue that you as a language designer definitely want to know about! If you have the following in a PEG:

    s -> a / b;

    …you have no way of knowing if this represents an ambiguity, which means that you don’t know if changing the grammar to:

    s -> b / a;

    …will make a substantive difference or not! You’re flying blind.

    The second reason is that Gazelle’s parsing algorithm will be far, far faster than packrat parsing (which is the algorithm used to parse PEG’s). Both algorithms will have linear asymptotic complexity (though as I briefly mentioned before, I believe that LL(*) grammars can degrade to n^2 in degenerate cases; though I will argue that this will never happen in any sane language). But though both are linear, packrat parsing’s constant factor is far higher, and packrat parsing uses O(n) space where n is the size of input.

    This might all sound like hot air, but check out these real, hard numbers. On the homepage of the Aurochs parser generator, which uses PEG’s and packrat parsing to parse JavaScript, they say:

    How much memory does it use?

    An awful lot.

    Grammar Input size Memory consumption CPU time Memory overhead
    Javascript 140kb 380Mb 1s 2700
    Javascript 71kb 180Mb 0.45s 2500
    Javascript 1717 5.6Mb 0.01s 3200

    I can’t vouch for whether this is a great implementation of packrat parsing, but the authors sure seem to know what they’re doing. I hope you’ll agree that 380Mb of memory just to parse 140kb of text is horrendous, and 1s is quite slow too. Gazelle can’t parse JavaScript yet, but here is a rough ballpark estimate of what the performance will be like to parse 140kb of text once it can:

    • memory: Gazelle uses 24 bytes per entry on its stack, plus 12 bytes for every token of lookahead. These are the only non-fixed costs. If we assume a parse tree that goes 100 deep, and an input that requires 100 tokens of lookahead (both ridiculously large assumptions), Gazelle’s memory footprint is 3.6kb. This is 0.0009% of the packrat parser’s memory requirement.
    • CPU: Gazelle currently parses JSON at 18.8Mb/s. JavaScript will be more complicated than JSON to parse, but Gazelle will also undergo a lot of optimization that hasn’t been done yet, so I’ll go with the 18.8Mb/s figure. That’s 134x faster than the 140kb/s Aurochs is quoting.

    Like I said, the Aurochs authors seem to be very smart — I’m not saying they’ve done a bad job. I’m just saying that if Aurochs is any indication of what we can expect from packrat parsing, it is orders of magnitude slower, and takes enormous amounts of memory even for modest amounts of input text. If anyone has better numbers to show for packrat parsing, I’m all ears.

    So even though it’s more work, I’m 110% convinced that using this combination of LL(*) with explicit ambiguity resolution is the way to go, and is the best way to achieve the goals that I have set out for this project.

  • Permalink for '2_Star/2008/08/07/Clarification_and_apology'

    Clarification and apology

    Posted: August 7th, 2008, 3:53am MDT by josh

    Terence Parr (author of ANTLR) pointed out a mistake I made when I was describing how ANTLR deals with non-LL(*) grammars. I said that ANTLR’s option for dealing with non-LL(*) grammars is to do a depth-first search which runs in exponential time. While I believe this is true if you enable “backtrack=true” alone, ANTLR also has a “memoize=true” option that you can use together with backtrack=true to reduce the time complexity to linear, at the expense of more space.

    I also claimed that ANTLR can handle all grammars when backtrack=true is on. Terence corrected me and said that even with backtrack=true, ANTLR does not handle all grammars — for example left-recursive ones. I mistakenly thought that depth-first search parsing would be capable of handling left-recursion, but after some more thought and some time with my trusty Grune and Jacobs I understood why this is not the case. There are other top-down algorithms that can handle this, but backtracking depth-first search (with or without memoization) is not one of them.

    Apologies to Terence for my misstatements about ANTLR!

  • Permalink for '2_Star/2008/08/06/The_grammars_Gazelle_can_handle'

    The grammars Gazelle can handle

    Posted: August 6th, 2008, 1:57pm MDT by josh

    I wanted to follow-up a bit more on my recent posts about the grammars Gazelle and competing parsing frameworks can handle. Here is an Euler diagram showing my current understanding of what grammars Gazelle can handle compared with the well-known Strong-LL language classes as well as competing parsing frameworks.

    GrammarClasses.png

    Gazelle can currently generate lookahead for any LL(k) grammar, all LL(*) grammars that ANTLR can handle, and a few LL(*) grammars (namely tail-recursive ones) that ANTLR cannot. ANTLR can handle all LL(k) grammars and many LL(*) ones. SLK can handle only LL(k).

    Since my last entry, I gave Gazelle the capability to correctly generate lookahead for all LL(k) languages, as long as the user specifies a maximum k. This is in-line with what SLK and ANTLR do, since the problem is undecidable in general. By default I still use my heuristic that I think will detect in most cases whether a grammar will be LL(*) or LL(k) without making the user specify a maximum k, but this heuristic will be foiled in some cases. So if the user really thinks their grammar is LL(k) even though it failed the heuristic, they can specify a maximum “k” and Gazelle will keep searching until it hits that “k”.

    Just in case I wasn’t clear enough in my last entry, I strongly dispute the claims on the SLK website that SLK is the only framework that does true Strong-LL(k) analysis. As you can see from the diagram, both ANTLR and Gazelle can handle not only LL(k) but a superset thereof. Current and potential SLK customers: don’t be fooled by SLK’s claims! They haven’t been true for years — ANTLR has been able to handle LL(k) for a long time!

    Here are some examples of grammars that fall within the different regions of that Euler diagram:

    Strong-LL(*) but can’t be handled by Gazelle or ANTLR:
    s: a | b;
    a -> "(" a ")" | "X";
    b -> "(" b ")" | "Y";

    Can be handled by Gazelle but not ANTLR:
    s -> a "X" | a "Y";
    a -> ("Z" a)?;

    Can be handled by ANTLR but not SLK (and is not Strong-LL(k)):
    s -> a "X" | a "Y";
    a -> "Z"*;

    A note about generalized parsing

    When I have talked about what grammars ANTLR can “handle,” I’m specifically talking about the grammars it can generate LL(*) lookahead for, without the help of syntactic or semantic predicates. Put more plainly, these are the grammars that ANTLR can parse efficiently and without any hints from the user.

    Why the caveats? Well you can put ANTLR and comparable tools in a mode where it parses by depth-first search, using backtracking to abandon search attempts that did not work out. This is equivalent to what most regexp implementations (Perl, Python, Ruby, etc) do, and as Russ Cox has demonstrated this technique has severe (exponential) worst case performance. If you want to trade space for time, you can memoize this depth-first search and get linear time at the cost of some space. While I think these techniques are totally legitimate as a way of expanding the grammars the tool can handle, I personally am singularly focused on parsing in linear time with very low space overhead. I am doing everything I possibly can to parse real languages (everything from JSON to C++) without going outside these bounds. That’s what’s going to make it practical to use Gazelle grammars for things like syntax highlighting in text editors.

    Secondly, ANTLR has features that let the user help the parser out with its decisions, either by writing Java code (semantic predicates) or by specifying syntax fragments that the user promises will properly disambiguate alternatives (syntactic predicates). It’s always useful to have more tools of this sort, and Gazelle will certainly include some with time, but again this is an apples to oranges comparison.

    Finally, Adrian Thurston (author of the popular Ragel state machine compiler) mentioned on one of my previous entries that there exist general (eg. can handle any grammar) algorithms for top-down parsing, which are automatically more powerful than Gazelle. While this is true, none of them are what I consider practical. Gazelle’s time/space complexity are:

    Time: linear
    (actually LL(*) could probably become quadratic in degenerate cases, but LL(k) is definitely linear and LL(*) is too, for all intents and practical purposes).

    Space: max stack depth + max lookahead depth
    Both can be capped. In a bit more detail:

    • The maximum stack depth represents the nesting level (for example, number of nested parentheses).
    • The maximum lookahead depth represents the number of tokens we have to look ahead in the worst case to disambiguate alternatives.
    • Input text that doesn’t respect the maximum depth limits will be rejected (even if it is valid according to the grammar), but that’s the only way to prevent malicious input text from consuming arbitrary amounts of memory. Clients that don’t mind getting DoS’d can just set these limits to unlimited.

    No general method can approach this space/time complexity. And I think it is for this reason that no tool (AFAIK) which is aimed at practical use implements a generalized top-down method.

    I’d like to post a slightly more detailed run-down of the generalized methods Adrian mentioned when I have the time to take a closer look at them.

    So to make a long story short, I stand by my previous claim that Gazelle is (at least briefly) the most powerful practical top-down parsing framework there is.

  • Permalink for '2_Star/2008/08/04/SVG_support_in_Firefox_3'

    SVG support in Firefox 3

    Posted: August 4th, 2008, 11:33am MDT by josh

    Whoa! Am I the only one who didn’t get the memo that suddenly Firefox has really impressive SVG support? Check out these demos and be amazed at what you didn’t think your browser (sans Flash) could do.

    Is there any more point to the HTML <canvas> element?

    Does anyone know if this notion of a “compound document” is standardized? Eg. where I can have an HTML page with an SVG snippit embedded, and I can manipulate it from JavaScript as one big DOM? I know this works in Firefox (as the above linked-to examples demonsrate), but will other browsers that add SVG support work the same way? If so, the standard HTML+CSS+JavaScript platform just got a *lot* more capable.

    There is the capability for doing animation (as this demo illustrates) by using JavaScript’s setTimeout or setInterval to run some JavaScript code periodically that updates the DOM. This seems to work decently, though you have to do all the math yourself and the animation can be a little choppy (since it’s all JavaScript). It appears that SVG also defines animations that you can specify declaratively which seems like it would both let you express animations at a higher level (”set this element into motion based on a spline with these key points”) and would probably get you smoother results since it the animation would be implemented at the browser level instead of JavaScript. But Firefox doesn’t currently support any of the animation module.

  • Permalink for '2_Star/2008/07/29/A_partial_retraction_to_my_last_entry'

    A partial retraction to my last entry

    Posted: July 29th, 2008, 2:25am MDT by josh

    From a theoretical standpoint, my last entry was incorrect when I said that my new algorithm in Gazelle can decide whether a grammar is strong-LL(k) or strong-LL(*). I had proofs against me — these problems have actually been proven undecidable.

    Once I heard this result, I spent hours trying to think of a grammar that is strong-LL(k), but that Gazelle will say is not. I am fairly certain that Gazelle’s algorithm will always terminate, so if the proof is to be believed, there must be a grammar that Gazelle falsely claims is not strong-LL(k) or strong-LL(*).

    After many hours, I came up with one. Here it is:

    s -> "X"* "Y" "Y" "Z"| "X" c;
    c -> "Y" c "Y" | "Q";

    This grammar is strong-LL(5), but Gazelle falsely claims it is not strong-LL(k) for any k. ANTLR can handle it (as long as the recursion depth is set high enough).

    I’m thinking, though, that this isn’t really a problem. My intuition says that it will be rare that anyone actually tries to write a grammar that fails in this way — after all, it took an evening of specifically trying to exercise this case before I could find a grammar that did. I don’t think real grammars will run into this problem.

    So even though I can’t handle all of strong-LL(*), I think I can handle the grammars that will actually come up in practical use, and I still have the extremely desirable property of being able to decide whether a grammar fits or not without forcing the user to tweak recursion depth constants. So although my result isn’t as theoretically significant as I thought, I think that from a practical standpoint I’m still in good shape: I have a terminating algorithm that can handle a very useful set of grammars.

Comments for Josh the Outspoken

  • Permalink for 'Comments_for_Josh_the_Outspoken/2008/07/28/Comment_on_Gazelle_is__at_least_briefly__the_most_powerful_top_down_parser_generator_there_is_by_josh'

    Comment on Gazelle is (at least briefly) the most powerful top-down parser generator there is by josh

    Posted: July 28th, 2008, 11:12am MDT by josh

    @maetl: thanks! It’s funny, when I started writing Gazelle, I believed pretty strongly in that mantra (”parsing and grammars are amongst the most studied and well understood areas of computer science”). I didn’t expect to invent any new algorithms, I mainly just wanted to take existing algorithms and give them different packaging.

    But before long I was trying to figure out if I could improve on the properties of those algorithms that made them less that ideal from a user perspective. I studied the documentation from existing tools and looked at the cases they said they couldn’t handle, and tried to figure out if I could handle them. And before I knew it, I had thought of a new algorithm!

    @Kay: thanks! I agree that having the tool be able to tell you definitively that it is not LL(k) or LL(*) will make things much more user-friendly. I could still use a bit of work on error messages that give you every bit of information about *why* it was not LL(k) or LL(*). In the future I should be able to draw pretty graphs that make the problem obvious. That will take a bit of work, but it’s totally doable, and I think will make for a compelling user experience.

2 Star

  • Permalink for '2_Star/2008/07/27/Gazelle_is__at_least_briefly__the_most_powerful_top_down_parser_generator_there_is'

    Gazelle is (at least briefly) the most powerful top-down parser generator there is

    Posted: July 27th, 2008, 7:27pm MDT by josh

    On Wednesday when I was sitting at home waiting for the cable guy, I thought very hard about top-down (LL) parsing and lookahead generation. After a few hours of thinking I had an epiphany where I very rapidly discovered several major results that I do not believe have ever been discovered before.

    I have implemented several (but not all) of these discoveries in the current Git version of Gazelle, and as far as I know, this makes Gazelle currently the most powerful top-down parser in existence.

    (One caveat for the above statement — when I say “powerful” I am referring to the set of grammars it can successfully generate lookahead for. Other parser generators still have features that Gazelle does not have, like syntactic and semantic predicates. And alas, Gazelle still lacks a way to ignore whitespace until the time that I redesign this capability in a way that I am happy with).

    As far as practical and powerful top-down parser generators go, I am only aware of two serious contenders. In this context, “serious contenders” means “can handle LL(k) for arbitrary k or better.” They are:

    • SLK, a commercial LL(k) parser generator. I dare not download or try it out, because I don’t want to be in any way entangled with their license agreement (which prevents reverse engineering) or tainted by what I learn about it.
    • ANTLR, which is a popular open-source parser generator written in Java. Terence Parr pioneered the LL(*) technique on which Gazelle builds, and Terence deserves a lot of credit for laying the groundwork in this field. It would have taken me a lot longer to get Gazelle where it is if not for his work.

      As a side note, Despite what the tough talk on the SLK homepage says, ANTLR has been LL(k) for years (LL(*) is a superset of LL(k)).

    Here are the key discoveries I made while the cable guy was assisting other customers. Apologies if some of this material is gibberish to people who aren’t deep in the field — I’ll attempt to summarize the high-level impacts of each discovery.

    Computing whether a grammar is LL(k) or LL(*) is decidable, without specifying a maximum k!

    What does this mean, in practical terms? Today, when you use either SLK or ANTLR, you have to specify how long they should keep searching for correct lookahead before they give up. For SLK (which supports only LL(k)) you specify a -k command-line argument which is the maximum number of lookahead tokens to consider before giving up. For ANTLR which supports both LL(k) and LL(*), you specify both a -k and a maximum recursion depth constant. (Correction from Terence Parr).

    Again, what does this mean? As a user of SLK or ANTLR, you have to choose an arbitrary number for these parameters. If you choose too low, you’ll make them give up before they find the answer. If you choose too high, you’ll have to wait longer before they fail in cases where the grammar is simply not LL(k) or LL(*). This is annoying and unintuitive for a user.

    What I discovered is that this problem is decidable: you can write an algorithm that will always terminate that either finds the lookahead successfully or tell the user that the grammar is not LL(*) (and why). This is much better for users, because you don’t have to worry about those constants and have to worry that you set them too low. Big win for users.

    Tail recursion is a-ok

    ANTLR’s algorithm currently cannot handle rules of this sort:

    s -> ("X" s)?

    This is a simple tail-recursive grammar that matches any number of X’s. ANTLR balks at this, but if you apply a tail-recursive optimization very much like the one implemented in imperative languages (don’t bother making a new stack frame when you re-enter the tail-recursive rule), a top-down parser can handle this just fine. Which is what Gazelle does.

    This isn’t a huge deal, since the grammar is better written:

    s -> "X"+;

    But it’s nice to be able to handle more cases.

    LL(*) can be extended to support full-LL(*) parsing

    This probably won’t make much sense to anyone who’s not deep in this field. At a high level, full-LL supports more grammars than strong-LL (which is what everyone means when they say “LL”), but full-LL parsing has not ever been considered practical. I haven’t tried it out yet, but I’m pretty sure I have a way of extending LL(*) to support full-LL(*) while still being practical and efficient. Bottom line: again, we could support more grammars. For example, here is a grammar that is full-LL(2) but not LL(2):

    s -> "A" a "A" "A" | "B" a "B" "A";
    a -> "B"?;

    Will full-LL be that important in practice? I’m guessing not. But from a theoretical perspective this result is important because I believe it will be the first practical, efficient method anyone has ever proposed for parsing full-LL.

    LL(*) could handle even more grammars by using regular supersets of nonregular lookahead

    LL(*) currently can only handle situations where the part of the grammar directly following a decision is regular. In some cases, that part of the grammar is not regular because it counts (for example, nested parentheses), but it doesn’t matter because all you’re really trying to do is skip past that part to a distinguishing token. For example, this grammar is from “The Definitive ANTLR Reference”:

    se -> e "%" | e "!";
    e -> "(" e ")" | "ID";

    In this grammar, “e” is nonregular because it matches parentheses. Regular expressions cannot count and therefore cannot match parentheses. However, if we’re trying to make a prediction at the beginning of “se”, all we’re really trying to do is skip past all that junk to get to either a “%” or a “!”. A regular expression would suffice — something like /\(*ID\)*/. The regex doesn’t count, but it skips past all those nested parentheses just fine. If the parentheses aren’t properly nested, we’ll figure that out later.

    What this means for Gazelle

    I made all these discoveries on Wednesday. On Saturday I had a day of mind-blowing productivity and implemented the first two of them, in addition to fixing all the previously-existing shortcomings that 0.2 had. So like I mentioned, Gazelle is currently (as far as I know) the most powerful top-down parser generator there is. Specifically, no other existing top-down parser generator can, at the moment:

    • handle tail-recursive grammars
    • tell you definitively when a grammar is not LL(k) or LL(*) (others make you tweak search depth constants)

    A few days ago I emailed Terence Parr (the author of ANTLR) telling him about these findings, and I expect that he will probably add these enhancements to ANTLR until he has feature parity with Gazelle. But I just wanted to take a moment to enjoy that I made a significant discovery in an important field. The second of the above points especially is just huge.

    As one final closing remark, I wanted to mention a grammar that the SLK documentation presents to boast its power. It is an LL(27) grammar with 26 terminal types. Here is what they have to say about it:

    The following grammar is LL(27), and has an alphabet of 26 terminals. So the possible number of lookahead strings is 2627, or about 160000000000000000000000000000000000000. SLK can analyze this grammar and construct a parser for it in about one second.

    [snip grammar]

    The -y option needs to be used because the grammar is in yacc syntax. And -k=27 or larger also needs to be specified. The output of SLK for this grammar is large, so it should be redirected to a file.

    Just to demonstrate that Gazelle can handle this, I checked this grammar in Gazelle format into sketches/ll27.gzl in the source tree. If you run gzlc on it with the --no-minimize-rtns option (minimization is a nice feature that is convenient for real grammars but too inefficient for this diabolical grammar), Gazelle generates correct lookahead in half a second (and this is totally unoptimized code written in Lua, a byte-code interpreted language), doesn’t require you to specify a -k=27 option, and the entire output is about 5k.

    I normally wouldn’t seek to put other parser writers down, but the SLK web page rubs me the wrong way when it falsely claims that:

    SLK is the only LL(k) parser generator available. All others that claim to be LL(k) are not really LL(k) by any of the classical definitions of LL(k) found in the literature. SLK does a full LL(k) analysis of the grammar. The proprietary SLK algorithm is the only known near-solution to this NP-complete problem. Other deterministic algorithms must either limit k to a very small value, usually two, or use approximate lookahead methods that are very weak imitations of LL(k).

    As I mentioned before, this has not been true for a while because of ANTLR, and it is even less true now with Gazelle, which is strictly more powerful than SLK.

    Update: I made a small correction that I received from Terence Parr. He also says he’s pretty sure that determining the k for an LL(k) language is undecideable, so we’ll see, maybe there’s something I’ve missed. I feel pretty confident about my solution, but I certainly could be wrong.

  • Permalink for '2_Star/2008/07/26/Good_experience_with_Comcast'

    Good experience with Comcast

    Posted: July 26th, 2008, 10:35am MDT by josh

    There’s a story on Slashdot right now about how Comcast is reading blogs that mention them as a way to try and improve their image among customers. I guess Comcast has a generally bad reputation when it comes to customer service.

    I just wanted to mention that I personally had a good experience with Comcast this week (not perfect, but quite good). I think that if a company is really trying to turn itself around and provide good service, they should be recognized for it, so this is my part in doing so.

    This week I needed to move my internet service from my old apartment into my new condo. I called Comcast customer service on Sunday at 6PM (+1 for having 24/7 customer service) to schedule an appointment. The 800 number made me choose 2 or 3 menu options (+/-0) after which I talked to a human being almost immediately (+1). The person I talked to was friendly and helpful (+1), though the first available appointment wasn’t until Wednesday 8-12 (+/-0). The cs rep gave me a discount on the visit which is apparently $100 and cut it down to $10 for me (+1!).

    On Wednesday I stayed home from work from 8-12 waiting for the rep. I didn’t wake up until about 8:30am and then I took a shower, so by nine I was slightly concerned by the possibility that I could have missed the field tech. I just didn’t want to find at noon that I had waited four hours in vain and would have to stay home from work for another four hours on a different day. So I called the 800 number again, again was able to talk to a rep almost immediately, who again was friendly and helpful and able to answer my question right away — they could confirm that the tech hadn’t visited me yet (+1). So I waited at home until noon.

    By noon the field tech still hadn’t arrived (-1). I called the 800 number, again almost immediately talked to someone friendly and helpful, and the rep said they’d contact the field agent and have him call me. 10 minutes later the field agent calls me and says he’ll be there in 5 minutes. 5 minutes later he’s at my doorstep.

    The field agent was also friendly and helpful (+1) — he was happy to explain a lot about what he was doing, he took the time to do the job right, and he even ran speakeasy’s bandwidth test on my computer once he was done. When he left he gave me his card and said I could contact him with any problems.

    So it looks like Comcast is really doing the right things to turn their image around. And I think that should be recognized (at least for as long as the level of service is like what I experienced this week).