Friday, November 09, 2007

Hacking Open Office Writer Lists


Here's a tip I'd like to share. It's about Writer, the OpenOffice.org Word Processor, but all concepts and described functionality are available als in Microsoft Word.

This tip will show you how to create styles to create notes in your documents. Among the more interesting things, the style does not only take care of the formatting, it also takes care of the static text that is repeated for every note. All this achieved without using macro automation.

A Note

Everybody that has written a manual or a business report has at some point dealt with things like this:

ooo-note

Yup, a note. It is essentially a paragraph with a gray background, a black border, and some extra space on all sides. And, oh, yeah...there is a little bold, capitalized caption that reads NOTE:. That's all it is.

Now, I bet that most people won't have much of an issue getting the word processor to format a paragraph to make it look like this. If you do have some trouble with that, hang on - a recipe is coming up.

Of course, it gets a bit annoying when you have to repeat this markup. I have no way to back this up, but it is my hunch that most people solve this simply by copying the paragraph, pasting it, and then editing the text.

(Well? Is it true? Don't be ashamed, post a comment and confess ;-)

Some people might look ahead a bit farther and consolidate the style for reuse. Let's take a look how that works:

Styles and Formatting

First open the styles and formatting toolbox. You can access it via Menu: "Format" > "Styles And Formatting" or by hitting the F11 key on your keyboard.

ooo-styles-and-formatting-menu

The Styles and formatting toolbox appears. I usually dock it to the right side of the Writer application window. This makes it very easy to apply the styles to the text at the current cursor position.

ooo-styles-and-formatting-window

Since we are styling a paragraph to look like a note, we need to click the first button on the little toolbar in the top of the Styles and Formatting toolbox to list the available paragraph styles.

In the list of paragraph styles, we will see that the current paragraph style is highlighted. Assuming your cursor was in a plain paragraph, this will be the "Default" style.

Now, we want to create a new paragraph style especially for our Note. To create a new style, right click in the list, and choose "New" in the context menu. This pops up the the paragraph style dialog:

ooo-paragraph-style

First, we need to give it a proper, sensible Name. Here, we dubbed it "MyNote". By default, Writer assumes it must use the new style also as style for the Next paragraph. Here I deviate from the default, and choose the "Default" style again as Next Style.

One of the most powerful features of styles is that they may be Linked With other styles. The new style is based on the style to which it is linked, effectively inheriting its properties. In this example, it means that changing, say, the font-size of the "Default" style will also change the font-size of the "MyNote" style accordingly.

Of course, "MyNote" may locally change its own font-size, and then that property will be preserved even if the "Default" style changes the font-size. In other words, the new style can override the properties of the style it is linked with.

Finally, a quick note about the style category listbox. By default, a new style is placed in the Custom Styles category. The Hierarchical category is also useful, as it allows you to see which styles are linked to each other:

styles-and-formatting-window-hierarchical

So, this is something to keep in mind when you plan to change a style, as this view allows you to quickly see which other styles will be affected by the change.

Ok - now that we dealt with these logical properties, we set the indents and spacing:

ooo-paragraph-style-spacing

Then, we make sure that all text in one "MyNote" paragraph is not broken up at page breaks:

ooo-paragraph-style-flow

This is taken care of by the "Do not split paragraph" checkbox.

And we set the background color:

ooo-paragraph-style-background

And finally the border style:

ooo-paragraph-style-border

The "Merge with next paragraph style" ensures that the border does not separate two consecutive paragraphs that are of the same style. In other words, it allows us to create notes that span more than one paragraph.

When we are all done defining the style, we simply confirm the dialog and close it.

Applying the style

Now that we created the style, we can apply it. First, position the cursor at the desired paragraph:

ooo-apply-style1

Then, double click the style in the "Styles and Formatting" toolbox:

ooo-apply-style2

Alternatively, you can toggle the Fill Format Mode button. Your mouse pointer then changes into a little bucket. In this mode, single clicking on a paragraph applies the selected style.

I explained that I like to dock the "Styles and Formatting" toolbox to the right side of the Writer Application window so I can easily apply styles where needed.

Are we done yet?

So, are we happy now? Well, actually...no.

The note we created manually has a little bold capital NOTE: caption. Of course, the first time around we simply typed that text in the note paragraph and marked it up in an ad-hoc manner.

Can we do better this time? Well, kind of...

First, lets see if we can consolidate the markup we use for the NOTE: caption. This time, we need to create a Character Style, because this time, we want to markup a stretch of text that appears within a paragraph.

ooo-character-style-toolbox

So, we create a character style linked to the "Default" character style. The procedure is completely comparable to creating a paragraph style. We first give it a sensible Name:

ooo-character-style-dialog

Like with paragraph styles, the character style is based on the style it is linked with. However, here, we cannot specify the next style. Instead, applying a character style automatically stays in effect until you switch to another character style.

We then make the font bold:

ooo-character-style-dialog-bold

And finally, we choose a Capitals effect:

ooo-character-style-dialog-effects

We can then confirm and close the dialog.

Now that we created the character style, we can apply it, and type text using this character style. We can either select a piece of text an change the selection to use the character style, or we can toggle the character style and type text in that style.

So, does this do the trick?

Well, it's better now than it was before. At least, now can ensure that all our little NOTE: captions look the same, and that their formatting is updated when we change the style.

However, the problem is that we still have to type this text all the time. It is not only tedious, it is also error prone. Ideally, we would like to somehow have the word processor treat this text as part of the paragraph style instead of it being just text.

As it turns out, there is a clean, reliable way to achieve this. Enter List styles.

List styles are normally used to create bulleted or numbered lists. Each list item is essentially a paragraph that is preceded by a bullet or a number which is generated automatically by the word processor.

This is the key to solving the problem. If we can somehow create a list style that inserts the text NOTE instead of a bullet or a number, we can use it to implement our note caption. So, let's try and create one:

Again, the first step of the process is very similar to creating a paragraph or a character style. We click the list style button, and right click in the "Styles and Formatting" toolbox:

oooo-styles-and-formatting-list

The list style dialog opens, and we provide a name for the list style:

oooo-list-style-dialog

Note however that list styles cannot be linked to other list styles.

We then specify that this list style is a numbering style. This may not really make sense at this point, but we really do need this step. That is because numbering styles allow us to prefix and postfix the list number with a piece of text - this is what we will use to add our Note Caption.

oooo-list-style-dialog-numbering

Now that we specified this style as a numbering style rather than a list style, a number of options become available:

oooo-list-style-dialog-options

Here, we specify that although it is a numbering style, we actually don't want any numbers to be inserted. This is specified by choosing None for Numbering. We also specify that we want the text Note: to appear before the list number. (Note however that we did specify that the list number is to be displayed using "None" - in other words, no number will actually be displayed ;-). Finally, we also specify to use our custom character style for the list number text - it is the character style we just created for our NOTE: captions.

Note that this step is the actual hack that specifies that the static NOTE caption text should be generated by the word processor whenever this list style is applied.

The final step in defining the list style is to undo any automatic indent that is applied to the list 'number'.

oooo-list-style-dialog-position

Checking the Relative checkbox makes the indent be relative to the current position of this list item in the list hierarchy. But we are not interested in this type of indent at all - we have specified the indent for at the level of the "MyNote" paragraph style. So we want to get rid of any additional indenting. So, we check the checkbox and specify that we want zero indenting relative to the current position in the list hierarchy.

(It might not be clear at this moment, but this step really is essential in order to allow a note to span multiple paragraphs.)

Applying the list style

Applying a list style is different from applying character or paragraph styles. In order to apply a list style, it needs to be attached to a paragraph style. This ensures that the paragraph starts behaving as a list item.

Please observe that attaching a list style to a paragraph style is different from linking a paragraph style to a paragraph style. A link is configured in the Linked With property in the Organizer tab of the paragraph style dialog. Linking causes the paragraph style that links to the other style to inherit its properties. Attaching a list style is done in the Numbering tab of the paragraph style dialog by setting the Numbering style property. Specifying a numbering style makes the paragraph style behave like a list item but does not change the properties of the paragraph style itself.

We might be tempted at this point to simply attach the list style to our "MyNote" paragraph style. This will work: all "MyNote" paragraphs will be starting automatically with the Note: caption.

However, directly attaching the "MyNote" paragraph style to the list style is not a good idea beacause all "MyNote" paragraphs will then start with the Note: caption: even two consecutive paragraphs in the "MyNote" style that together make up one note would each have the NOTE: caption. So, specifying the numbering style in the "MyNote" paragraph style would prevent us from creating Notes with multiple paragraphs.

The solution to that problem is create a new paragraph style called "MyNoteFirstLine", linked to the original "MyNote" style:

ooo-MyNoteFirst

To this new "MyNoteFirstLine" style we then attach our list style:

ooo-MyNoteFirstNumbering

By linking it to the original "MyNote" style, we ensure that there is still a single place that defines all of the "normal" Note formatting. So, if we want to change, say, the background of our notes, we will only need to modify the "MyNote" style, and the changes will cascade through to the "MyNoteFirstLine" style. By Setting the Next Style also to the MyNote style, we ensure that only the first paragraph of our note gets the Note: caption, and that we can continue the note without the static Note caption being generated.

So, here's the final result. We apply the "MyNoteFirstLine" paragraph style to initiate a Note:

ooo-Note1

Then we type some text. Once we hit Enter, a new "MyNote" paragraph is created:

ooo-Note2

and the result is a note spanning two paragraphs, with only the first paragraph having the static NOTE: caption.

Obviously, there is now too much white space between the paragraphs within the note. That's because both the "MyNote" as well as the "MyNoteFirstLine" styles have the same settings for "Indents and spacing". However, it is trivial to tweak the settings to achieve a more normal spacing.

To do so, you simply have to modify the "MyNote" style and decrease the spacing that appears above the paragraph. That change will of course cascade through to the "MyNoteFirstLine" paragraph style. But we do want to make sure that the spacing occurs above all "MyNoteFirstLine" paragraphs in order to separate the Note from the normal paragraphs.

So, instead, we need to specify the spacing to appear above the "MyNoteFirstLine" paragraph style. You likely also want to decrease the spacing below the "MyNoteFirstLine" paragraph style in order to make it appear more closely to the remaining paragraphs in the note.

We will leave this as an exercise to te reader.

Summary

This might seem like a long post, but it an be quickly summarized: To create a note like this you:
  • Create a generic "Note" style
  • Base a "First" paragraph style off the "Note" style by linking to it, and make sure the Next style is the Note style.
  • Create a character style for the Note caption.
  • Create a list style for the Note caption and configure it to use the character style. Then, attach the list style to "First" paragraph style

Random RDBMS and SQL Myths debunked

A few times now, I've been wanting to write this down. I know: a lot of people will go *shrug*. Others may find me pedantic. Some of will say I'm being a smart-ass. Whatever...but I just got to write down a few of these common misconceptions that keep floating around.

None of these misconceptions are really harmful - in most cases, they do not lead to misunderstanding or miscommunication. However, when you are writing about these subjects, you'll often find that a sloppy definition you used in some place will bite you in the tail, and make it harder to explain something later on. So, that is why I from time to time get kind of obsessed with finding just the right words.

I'm not pretending I have the right words though. But there are a few informal ways of saying things that at a glance look right but are in fact wrong. Here's a random list of some of them:

PRIMARY KEY and UNIQUE constraints are unique indexes

Wrong - an index is just a convenient implementation to detect duplicate entries, and this is used by all RDBMS-es I am familiar with to implement PRIMARY KEY and UNIQUE constraints. However, the fact that there is a distinction is evident in for example the Oracle SQL syntax. For example, in ALTER TABLE ... DROP CONSTRAINT you can specify whether the associated index should be kept or also discarded.

Some people argue that it does not make sense to make the distinction in case the RDBMS does not maintain the constraint and index as separate objects. (This is the case in for example MySQL.)

Well, maybe...but I disagree. When I think about constraints, I'm thinking about business rules and guarding them to maintain database integrity. When talking about indexes, I'm thinking about performance and access paths. Quite different things, and in my opinion a shame to throw away the words to express the difference in my opinion.



A table consists of rows and columns

No - there is nothing wrong with an empty table. In other words, it does not consist of rows. It may or may not contain rows, but that is a different story.


A scalar subquery returns one column and one row

Wrong - first of all, a scalar subquery may return zero rows, in which case it evaluates to null, which is perfectly valid. But there is more to it.

Whether something is or is not a subquery is matter of syntaxis. The SQL grammer is defined so that if you encounter a query between parenthesis where a scalar value is appropriate, then that query (including the parentheses) will be parsed as a scalar subquery. In other words, the text satisfies the production rule for the non-terminal symbol "scalar subquery".

The parser will usually be smart enough to verify whether the subquery yields one column, but the number of rows returned is a runtime affair.

Suppose the query that makes up the scalar subquery would in fact return more than one row...would it suddenly not be a scalar subquery anymore? Of course not. It is still a scalar subquery - it just happens to be impossible to execute it. In other words, it violates the semantics of a scalar subquery and is therefore invalid. But the mere fact that we can conlcude that must imply that it is a scalar subquery.


A subquery is a SELECT statement that appears as a part of another SELECT statement

Wrong - For the same reasons as the previous issue. A statement is a syntactical construct. It has to do with discovering a pattern in a piece of text so that it satisfies a particular rule in the SQL grammer. That grammar does not have a rule that allows statements to be nested - not in pure SQL anyway (Of course, in stored procedures, one can have statement blocks like BEGIN...END, IF...END IF etc that really can contain other statements)

Of course, if we would take the SELECT that makes up the subquery and run it in isolation, it would be a SELECT-statement. Bit that is exactly the heart of the matter: because we are regarging it as part of another statement, it cannnot be a statement itself. This is simply a matter of definition of course - most people will immediately understand what is meant.

What would be better to say though is that a subquery is a query or query expression that appears as part of another SQL statement. However, this is also not correct: CREATE VIEW for example does contain a query expression, but this would most likely not be called a subquery. For this particular case, you can argue that there is nothing sub-ish about the query expression, because it is simply an essential part of the CREATE VIEW statement.

But what to think of CREATE TABLE...AS SELECT... and INSERT INTO...SELECT? The query expression is certainly not an essential part of CREATE TABLE and INSERT INTO, and in that sense, the query does look like it is subordinate to the statement it is part of.

You could argue that a query is a subquery if it appears inside another query. That seems sound, but what to think of UPDATE ... SET = (SELECT ...)? Personally I am reluctant to call an UPDATE statement a query - I tend to think of a query as a SELECT statement or sometimes a query expression.

I can think of only one thing that really is a defining characteristic of a subquery though - that is that the query expression must appear within parentheses. So, again, a matter of syntax more than a matter of semantics. I must admit I'm still not very satisfied with this though...What do you think?


NULL is the absence of a value

Variants of this statement go like "NULL is a missing value" or "NULL is not a value".

With slight doubt, I say: wrong. It certainly is true that many people use NULL to convey that something is not there or that something is not applicable. But this is a matter of choice, it does not change the meaning of NULL itself. If we use the same line of reasoning as we used for the subquery myth, we must conclude that NULL is certainly a valid value expression. It can legally appear anywhere where we can put a value. It is IMO also perfectly ok to say things like "...that expression evaluates to NULL".

So what does the SQL standard say? Well, here's a quote:

...the null value is neither equal to any other value nor not equal to any other value — it is unknown
whether or not it is equal to any given value....

So, I'm in that camp too: NULL is a value, and if we have a NULL in say, the integer domain, we just don't know which of all possible integers it is.


Foreign keys must reference a primary key

Wrong - a Unique constraint is mostly just as acceptable.

In MySQLs InnoDB it is even more relaxed - the foreign key only needs to reference the prefix of an index in the parent table, although this is so exotic, it should probably be ignored.


This table is not normalized - it still contains redundancy

Wrong - a table is normalized when it is in the first Normal form. There are a few different opinions what that means exactly, but it usually works to say that a table is not normalized when it contains repeating groups.

A slightly stronger statement is to say that a table is not normalized when it contains data that is not atomic. This is stronger because it does not cover only repeating groups, but also columns that, for a single row, do not contain a single value. For example, a first name/last name combination in one column is not atomic, and therefore, a table that contains such values is not normalized. (There are opinions that require even more than this, but for practical purposes the sense of atomic values works pretty well.)

The source of confusion is in what happens beyond the first normal form. Although a table maybe normalized, it can still contain redundancy. By removing redundancy, you can progressively achieve a higher normal form. In many cases, one would require at least third normal form or the Boyce-Codd normal form for building database schemas. Many people say "normalized" when they actually mean "in at least the third normal form".



So - what do you think? Pedantic? Have other myths? Maybe you have a good, satisfactory definition for subqueries? Or maybe you find an error in my debunkings?

Just drop me a comment on this post - thanks in advance.

DuckDB Bag of Tricks: Reading JSON, Data Type Detection, and Query Performance

DuckDB bag of tricks is the banner I use on this blog to post my tips and tricks about DuckDB . This post is about a particular challenge...