Talk:InterSystems Caché

	This article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
	This article has been rated as Low-importance on the project's importance scale.
	This article is supported by WikiProject Software (assessed as Low-importance).

From VfD:

Advert? Notable? Dunc_Harris|☺ 14:13, 14 Jul 2004 (UTC)

Keep. The article is pretty decent and does not read as an advert. And the Caché system is definitely a productive system. --Alexandre 23:32, 14 Jul 2004 (UTC)
Keep. I've done a little work on the article to reduce promotional POV. It could use a little more, but it is a reasonably notable product from a reasonably notable company, and currently the most important application of the MUMPS computer language (although InterSystems, for marketing reasons, detests the name MUMPS and tries to avoid any reference to it...) Dpbsmith 00:54, 16 Jul 2004 (UTC)

End moved discussion

from Votes for deletion/Caché (wrong namespace):

Keep. It's just a stub. A poor stub, but still a stub. — Frecklefoot | Talk 14:52, Jul 14, 2004 (UTC)

Keep. Stub. Needs work though. The Lurker 16:46, 20 Jul 2004 (UTC)
- The above user is a sock puppet whose entire edit history is to VfD articles. Discount the vote. RickK 22:06, Jul 20, 2004 (UTC)

I tried to improve the article by using more precise terms, but my corrections have been undone. So here are my explanations: 1) Caché is not a multidimensional database (though InterSystems claims so in its advertising). The term "multidimensional database" stems from the OLAP context, where the same data can be accessed via INDEPENDENT dimensions (slicing and dicing are typical operations). The "dimensions" in Caché globals are actually hierarchy levels: you need to know the first level in order to use the second level. 2) Caché globals are not B-trees! B-trees are self-balancing trees. Caché globals are hierarchically structured data sets, which are structured according to data semantics.
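To make the distinction in point 1 concrete, here is a minimal Python sketch. It does not model Caché's actual storage; the nested mapping only loosely imitates a global such as ^Sales(region, year), and all names and figures are hypothetical. It shows that in a hierarchy the first level is privileged, while in a flat, OLAP-style fact set every dimension can be fixed independently.

```python
# Hypothetical data, for illustration only.

# Hierarchical model: a nested mapping, loosely analogous to a global
# such as ^Sales(region, year) = amount.
sales_global = {
    "EU": {2004: 10, 2005: 12},
    "US": {2004: 20, 2005: 25},
}


def total_for_region(g, region):
    """Descend one level, then read the subtree under that region."""
    return sum(g[region].values())


def total_for_year(g, year):
    """No direct path by year: every first-level node must be visited."""
    return sum(years.get(year, 0) for years in g.values())


# OLAP-style model: flat facts, no dimension is privileged.
facts = [
    {"region": "EU", "year": 2004, "amount": 10},
    {"region": "EU", "year": 2005, "amount": 12},
    {"region": "US", "year": 2004, "amount": 20},
    {"region": "US", "year": 2005, "amount": 25},
]


def slice_facts(rows, **fixed):
    """Keep the rows matching every fixed dimension value (a 'slice')."""
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]


if __name__ == "__main__":
    print(total_for_region(sales_global, "EU"))
    print(total_for_year(sales_global, 2005))
    print(sum(r["amount"] for r in slice_facts(facts, year=2005)))
```

Both models can answer both questions; the difference is that the hierarchical one has to walk all first-level nodes to answer a question about the second level, whereas the flat one treats region and year symmetrically.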

I think a correct statement would be: "B-trees are used as the underlying implementation of globals in all modern versions of MUMPS." Dpbsmith (talk) 20:34, 26 October 2005 (UTC)

I do not agree: B-trees are access paths which are used to speed up access to data via an index. In database theory, access paths are clearly separated from the data model: it should be possible to add or remove an index without having to modify application programs that use the data. B-trees (and other kinds of access paths) are used in relational database systems, but no one would say B-trees are used as the underlying implementation of tables, because the tables are still there when the index is removed. The same holds for Caché globals: globals are not B-trees, and even if B-trees are used to speed up access to globals, this would be an additional data structure, which is not seen by an application programmer.

I would highly recommend that you read the original documentation before jumping to any assumptions about internal database structures. In Caché (unlike Oracle, MS-SQL and others) the only possible internal data structure is the global, and a global is B-tree based.

"Globals are stored on disk within a series of data blocks; the size of each block (typically 8KB) is determined when the physical database is created. To provide efficient access to data, Caché maintains a sophisticated B-tree-like structure that uses a set of pointer blocks to link together related data blocks. Caché maintains a buffer pool — an in-memory cache of frequently referenced blocks — to reduce the cost of fetching blocks from disk."

See http://platinum.intersystems.com/csp/docbook/DocBook.UI.Page.cls?KEY=GGBL_structure#GGBL_C10896 for details.

aou | Talk

Well, I would highly recommend reading solid database literature instead of commercial advertising. However, thanks for your link! Now, please read the text you cited carefully: "Globals are stored on disk within a series of data blocks; the size of each block (typically 8KB) is determined when the physical database is created. To provide efficient access to data, Caché maintains a sophisticated B-tree-like structure that uses a set of pointer blocks to link together related data blocks...."

The "B-tree-like structure" is an access path which is added to the sequence of data blocks that contain the actual data in order to provide efficient access!

Isn't this almost exactly what I said? "B-trees are used as the underlying implementation of globals in all modern versions of MUMPS."

No! This is not what you said. I explained this in my comment, which has been deleted (maybe due to synchronization errors). So again: "Globals are STORED on disk WITHIN a series of data blocks". In other words: a series of data blocks is used as a container to store the global. The global itself is hierarchically structured according to data semantics. The "B-tree-like structure" is used to link the blocks that CONTAIN the global. Both B-trees and globals are hierarchically structured, but the structure of the B-tree has nothing to do with the structure of the global! If you say that "B-trees are used as implementation OF globals" you mix up different storage levels. A global and its hierarchical structure are seen and used by an application programmer, but the B-tree is (hopefully) not. A B-tree is always balanced; a global is usually not balanced. By the way, the text does not explain what a "B-tree LIKE" structure actually is, so stating that B-trees are used is not correct.
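The two storage levels argued about above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Caché is implemented: the block size, the example nodes, and the single-level pointer list (much simpler than a real B-tree) are all hypothetical. The point is only that an unbalanced logical hierarchy can sit inside a uniform sequence of blocks, with a separate access path on top.

```python
from bisect import bisect_right

# Logical level: an unbalanced hierarchy (hypothetical example data).
# Each key is a subscript path; its depth varies from 1 to 3.
nodes = {
    ("A",): "root A",
    ("A", "1"): "a1",
    ("A", "1", "x"): "a1x",
    ("A", "1", "y"): "a1y",
    ("A", "2"): "a2",
    ("B",): "root B",
}

BLOCK_SIZE = 2  # entries per block; an arbitrary assumption


def build_blocks(data, block_size=BLOCK_SIZE):
    """Physical level: store entries in key order in fixed-size blocks."""
    items = sorted(data.items())
    return [items[i:i + block_size] for i in range(0, len(items), block_size)]


def build_index(blocks):
    """Access path: the first key of each block, kept apart from the data."""
    return [block[0][0] for block in blocks]


def lookup(blocks, index, key):
    """Find the candidate block via the index, then scan that block."""
    pos = bisect_right(index, key) - 1
    if pos < 0:
        return None
    for k, v in blocks[pos]:
        if k == key:
            return v
    return None


blocks = build_blocks(nodes)
index = build_index(blocks)

if __name__ == "__main__":
    print(lookup(blocks, index, ("A", "1", "y")))
    print(lookup(blocks, index, ("B",)))
```

Every lookup costs one index search plus one block scan, whether the key sits at depth 1 or depth 3 of the logical hierarchy. Dropping `index` and scanning all blocks would still return the same values, only more slowly, which is the sense in which an access path is separate from the data it points to.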

Just as C (and C++) deliberately sacrifice a certain amount of safety and generality in favor of efficiency, so do database programs written in MUMPS.

It is possible to say objectively that, comparing Java and C++, Java trades off efficiency in favor of safety, generality, and better adherence to object-oriented methodology. It is not possible to say objectively that Java is always a better choice than C++ (or vice versa). The same can be said of MUMPS and relational databases. Dpbsmith (talk) 19:30, 2 November 2005 (UTC)

I completely agree, but this was never questioned! I only advocate using precise database terminology. My experience is that InterSystems tries to avoid "bad terms" such as "hierarchical database". "B-tree", instead, is a "good term", so it is used in their product descriptions, even if the actual access path is only somewhat similar to a B-tree. I understand this, and I even understand that the term "postrelational database" might have a marketing effect. But here we are working on an encyclopedia! We should only use scientifically sound, unambiguous, and precise explanations, as far as possible.

I disagree on this statement:

 Caché claims it is one of the fastest databases, and so is ideal for real-time applications.

A database being fast does not thereby make it ideal for real-time applications. Real-time means that it can respond within predictable time bounds. Considering that it focuses on using caches, this would mean exactly the opposite: sometimes the data will be cached and sometimes it won't, which makes response times unpredictable and makes it dreadfully bad for real-time applications.
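A toy simulation illustrates the point about caches and predictability. It is a sketch only: the LRU policy, the capacity, the access pattern, and the hit and miss costs are all hypothetical and say nothing about how Caché's buffer pool actually behaves.

```python
from collections import OrderedDict

HIT_COST = 1     # hypothetical time units for a cached block
MISS_COST = 100  # hypothetical time units for a fetch from disk


def simulate(accesses, capacity):
    """Return the latency of each access under a small LRU cache."""
    cache = OrderedDict()
    latencies = []
    for block in accesses:
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
            latencies.append(HIT_COST)
        else:
            latencies.append(MISS_COST)
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return latencies


accesses = [1, 2, 1, 2, 1, 2, 3, 1, 2, 1]
latencies = simulate(accesses, capacity=2)

if __name__ == "__main__":
    print(latencies)
    print("best:", min(latencies), "worst:", max(latencies))
```

The same request (block 1, say) is sometimes answered at hit cost and sometimes at miss cost, depending on what was accessed before it. A hard real-time guarantee would have to be stated in terms of the worst case, not the average.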

You are thinking of "real-time" systems such as the ones used in industrial embedded systems, where a response must occur within a fixed timeframe, so they nail the timing every time. InterSystems is using a non-technical meaning of real-time.

I don't think that the source behind the 2nd reference link adequately substantiates that "Caché claims it is one of the fastest databases", nor does it match the reference description "InterSystems Caché – World's fastest database". I'm guessing that InterSystems has since updated that webpage so that it no longer includes the claim of being the fastest.

99.135.73.44 (talk) 10:35, 25 August 2009 (UTC)