This is the mail archive of the ecos-devel@sourceware.org mailing list for the eCos project.


Re: NAND technical review

From: Jonathan Larmour <jifl at jifvik dot org>
To: Ross Younger <wry at ecoscentric dot com>
Cc: ecos-devel at ecos dot sourceware dot org
Date: Tue, 13 Oct 2009 03:21:06 +0100
Subject: Re: NAND technical review
References: <4AC6218C.20407@jifvik.org> <4ACB4B58.2040804@ecoscentric.com> <4ACC0722.9020601@jifvik.org> <4ACDF868.7050706@ecoscentric.com> <4ACEF3D1.1090609@ecoscentric.com>

[ Lots of snippage throughout - assume "ack" or comprehension]
Ross Younger wrote:

Jonathan Larmour wrote:
 > > Personally I would expect use as an interrupt line as the main role of
 > > the ready line.
IMLE the overhead of sleeping and context switching is quite significant. In the drivers I've written to date, where there is a possibility to use the ready line as an interrupt source I have provided this as an option in CDL.

For reads polling is good sure, for programs interrupts are probably better, for erases interrupts will almost certainly be better. I note that that's what you arrange for interrupt mode on the EA2468 port example, which is good.
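To make the trade-off concrete, here is a minimal sketch of how a driver might pick a wait strategy per operation. The macro HYP_NAND_USE_READY_INTERRUPT merely stands in for a CDL option, and all the names are made up for illustration; none of this is taken from either implementation.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { NAND_OP_READ, NAND_OP_PROGRAM, NAND_OP_ERASE } nand_op_t;
typedef enum { WAIT_POLL, WAIT_INTERRUPT } wait_mode_t;

/* Hypothetical stand-in for a CDL option enabling the ready-line interrupt. */
#ifndef HYP_NAND_USE_READY_INTERRUPT
#define HYP_NAND_USE_READY_INTERRUPT 1
#endif

/* Choose how to wait for the chip to become ready.
 * Reads are short, so sleeping and context switching would cost more than
 * spinning; programs and erases take longer, so sleep until the interrupt. */
wait_mode_t nand_choose_wait(nand_op_t op, bool ready_irq_wired)
{
    assert(op == NAND_OP_READ || op == NAND_OP_PROGRAM || op == NAND_OP_ERASE);

    /* Without the option, or without the line wired up, polling is all we have. */
    if (!HYP_NAND_USE_READY_INTERRUPT || !ready_irq_wired)
        return WAIT_POLL;

    return (op == NAND_OP_READ) ? WAIT_POLL : WAIT_INTERRUPT;
}
```

A real driver would of course hang the actual polling loop or interrupt wait off the returned mode.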

But I digress, as this isn't something specific to your implementation.

 > > What problems would you see, if any, using your layer with the same
 > > controller and two completely different chips, of different geometry?
 > > Can you still have a common codebase with other (different) platforms?

I don't see any issue: controllers don't IME care about the chip geometry,
they just take care of the electrical side, and some calculate ECC in
passing. For that matter I don't see an issue with a single controller on
one board driving two chips of different geometries at once.

Hmm, I guess the key thing here is that in E's implementation most of the complexity has been pushed into the lower layers, at least compared to R's. R's has a more consistent interface through the layers, albeit at the expense of some rigidity and noticeable function overhead.

It's not likely E's will be able to easily share controller code, given of course you don't know what chips, and so what chip driver APIs they'll be connected to. But OTOH, maybe this isn't a big deal since a lot of the controller-specific munging is likely to be platform-specific anyway due to characteristics of the attached NAND (e.g. timings etc.) and the only bits that would be sensibly shared would potentially happen in the processor HAL anyway at startup time. What's left may not be that much and isn't a problem in the platform HAL. However the likely exception to that is hardware-assisted ECC. A semi-formal API for that would be desirable.

>> >> 2. Application interface -----------------------------------------------

>> >> The basic operations required are reading a page, programming a page and
>> >> erasing a block, and both layers provide these.
> >
> > However I believe Rutger's supports partial page writes (use of
> > 'column'), whereas I don't believe eCosCentric's does.
As covered in the other subthread, is this actually useful, and how to sort
out the ECC?

Read back the whole page (which is a drop in the ocean compared to the time to do a full page program of course). memcmp the partially written section for validity, then regenerate the ECC. Unless the partial write was most of the page anyway (and a heuristic could deal with that), you should still end up ahead.
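Roughly, the read-back approach could look like the sketch below. It runs against a RAM stand-in for one page (programming can only clear bits), the page size is assumed, and the ECC routine is passed in as a callback; toy_sum is a placeholder checksum, not a real ECC. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HYP_PAGE_SIZE 2048u   /* assumed page size */

typedef struct { uint8_t data[HYP_PAGE_SIZE]; } sim_page_t; /* RAM stand-in for a page */
typedef uint32_t (*ecc_fn_t)(const uint8_t *buf, size_t len);

/* NAND programming can only clear bits, hence the AND. */
static void sim_program(sim_page_t *pg, size_t column, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        pg->data[column + i] &= src[i];
}

static void sim_read(const sim_page_t *pg, uint8_t *dst)
{
    memcpy(dst, pg->data, HYP_PAGE_SIZE);
}

/* Placeholder only: a byte sum, NOT a real error-correcting code. */
uint32_t toy_sum(const uint8_t *buf, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}

/* Program part of a page, read the whole page back, verify the written
 * section, then regenerate the ECC over the full page contents.
 * Returns 0 on success, -1 if the read-back does not match. */
int partial_write(sim_page_t *pg, size_t column, const uint8_t *src, size_t len,
                  ecc_fn_t ecc, uint32_t *ecc_out)
{
    uint8_t readback[HYP_PAGE_SIZE];

    assert(column <= HYP_PAGE_SIZE && len <= HYP_PAGE_SIZE - column);

    sim_program(pg, column, src, len);
    sim_read(pg, readback);                      /* whole-page read */
    if (memcmp(readback + column, src, len) != 0)
        return -1;                               /* partial section did not take */

    *ecc_out = ecc(readback, HYP_PAGE_SIZE);     /* regenerate for the new contents */
    return 0;
}
```

How the regenerated ECC then gets stored in the spare area is deliberately left to the caller, since that is the part the other subthread is arguing about.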

Alternatively, some people may not want or need ECC. Higher layers may be able to deal or have their own checking. Or the write patterns could be sufficiently infrequent that it's not an issue worth solving (e.g. firmware upgrades). In some cases you may not use ECC in one part managed by e.g. a simple boot loader which you want to keep small; and then in a different region on the same NAND there's a filesystem which does exploit ECCs.

>> >> Rutger's layer has an extra hook in
>> >> place where an application may explicitly request the use of cached reading
>> >> and writing where the device supports this.
> >
> > That seems like a useful potential optimisation, exploiting underlying
> > capabilities. Any reason you didn't implement this?
> >
> > I could also believe that NAND controllers can also optimise by doing
> > multiple block reads, where this hint would also prove useful.
Not particularly. Looking at cache-assisted read and program operations for
multi-page operations is sitting on my TODO list, languishing :-). I would
note in passing that YAFFS doesn't make use of these, preferring only to
read and write single pages fully synchronously; this might be a worthwhile
 enhancement in dealing with larger files, though YAFFS's own internal NAND
interface is strictly page-oriented at the moment and so this would require
a bit of brain surgery - something best done in conjunction with Charles
Manning, I think.

Looking to the future, and given things like http://osdir.com/ml/linux.file-systems.yaffs/2008-09/msg00010.html, this may well change.

Plus contiguous reads are more likely to be useful in other NAND using applications than a general-purpose FS. Contiguous writes admittedly would be less useful to exploit, but if you can have the facility for reads you may as well have the writes.

> > Does your implementation _require_ a BBT in its current implementation?
> > For simpler NAND usage, it may be overkill e.g. an application where the
> > number of rewrites is very small, so the factory bad markers may be
> > considered sufficient.

I suppose it would be possible to provide a CDL option to switch the persistent BBT off if you really wanted to. Caution is required, though: after you have ever written to the chip, it can be impossible to distinguish a genuine factory-bad marker from application data in the OOB area that happens to resemble it. This can be worked around with very careful management of what the application puts into the OOB or by tweaking the OOB layout to simply avoid ever writing to the relevant byte(s).

Oh I'm sure that most people will use a BBT if they can, but for simple booting applications it may be overkill and the management has a penalty. Factory markers and use of the OOB in appropriate ways can avoid the need for a BBT for simple applications e.g. by relying only on ECCs, or its own "this verified ok" marker in the OOB area.
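For the simple case, a scan of the factory markers with no persistent BBT might be sketched as follows. The marker position (OOB byte 0 of a block's first page, with 0xFF meaning good) is an assumption for illustration; the real position depends on the chip and must come from its datasheet. It also only stays trustworthy if the application never writes to that byte, per the caution above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed marker position; chip-specific in reality. */
#define HYP_BAD_MARKER_OFFSET 0u

/* A block is treated as factory-bad if its marker byte is not 0xFF. */
bool block_is_factory_bad(const uint8_t *oob, size_t oob_len)
{
    assert(oob != NULL && oob_len > HYP_BAD_MARKER_OFFSET);
    return oob[HYP_BAD_MARKER_OFFSET] != 0xFF;
}

/* Scan the first-page OOB of every block and fill an in-RAM table.
 * 'oobs' holds nblocks consecutive OOB buffers of oob_len bytes each,
 * as they would have been read from the chip.
 * Returns the number of bad blocks found. */
size_t scan_factory_bad(const uint8_t *oobs, size_t oob_len, size_t nblocks, bool *bad)
{
    size_t count = 0;

    for (size_t b = 0; b < nblocks; b++) {
        bad[b] = block_is_factory_bad(oobs + b * oob_len, oob_len);
        if (bad[b])
            count++;
    }
    return count;
}
```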

>> >> (a) Partitions
> > [snip]
>> >> R's interface does not have such a facility. It appears that, in the
>> >> event that the flash is shared between two or more logical regions, it's
>> >> up to higher-level code to be configured with the correct block ranges
>> >> to use.
> >
> > In yours, the block ranges must be configured in CDL. Is there much
> > difference? I can see an advantage in writing platform-independent test
> > programs. But in applications within products possibly less so.

I provide CDL for manual config, but have included a partition layout initialisation hook. If there was an on-chip partition table, all that's needed would be some code to go into that hook to interrogate it and translate to my layer's in-memory layout. This is admittedly not well documented, but hinted at by "Planning a port" (http://www.ecoscentric.com/ecospro/doc.cgi/html/ecospro-ref/nand-devs-writing.html) and should be readily apparent on examining code for existing chip drivers.

Ok, that sounds like quite a good thing. It also sounds harder for R's to play nicely with Linux.

 > > Especially since the flash geometry, including size, can be
 > > programmatically queried.
Flash geometry can only be programmatically queried up to a point in non-ONFI chips. Look at the k9_devinit function in k9fxx08x08.inl: while the ReadID response of Samsung chips encodes the page, block and spare area sizes, it doesn't tell you about the chip block count or overall size - you have to know based on the device identifier byte. Linux, for example, has a big table of these in drivers/mtd/nand/nand_ids.c.

Ahh, ok.
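For illustration, the split between what ReadID tells you and what has to come from a table might look like the sketch below. The bit layout of the fourth ID byte and the three table entries are as I recall them from the Linux MTD code for Samsung-style parts; treat both as assumptions to check against the datasheet. This is not the code in k9fxx08x08.inl, and the names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t page_size;    /* bytes */
    uint32_t spare_size;   /* bytes per page */
    uint32_t block_size;   /* bytes */
    uint32_t block_count;
} nand_geom_t;

/* Total size cannot be derived from ReadID; it has to be known per device ID.
 * Entries are assumptions recalled from Linux's ID table. */
typedef struct { uint8_t dev_id; uint32_t size_mib; } dev_entry_t;

static const dev_entry_t dev_table[] = {
    { 0xF1, 128 }, { 0xDA, 256 }, { 0xDC, 512 },
};

/* Decode geometry from the device ID byte and the 4th ReadID byte.
 * Returns 0 on success, -1 if the device ID is not in the table. */
int decode_geometry(uint8_t dev_id, uint8_t id4, nand_geom_t *g)
{
    assert(g != NULL);

    g->page_size  = (uint32_t)1024 << (id4 & 0x03u);               /* bits 1:0 */
    g->spare_size = ((uint32_t)8 << ((id4 >> 2) & 0x01u))          /* bit 2: 8 or 16 */
                    * (g->page_size / 512u);                       /* per 512 bytes */
    g->block_size = ((uint32_t)64 * 1024u) << ((id4 >> 4) & 0x03u); /* bits 5:4 */

    for (size_t i = 0; i < sizeof dev_table / sizeof dev_table[0]; i++) {
        if (dev_table[i].dev_id == dev_id) {
            g->block_count = dev_table[i].size_mib
                             * ((1024u * 1024u) / g->block_size);
            return 0;
        }
    }
    return -1;   /* sizes decoded, but overall capacity unknown */
}
```

An ONFI chip would instead report these values itself, which is the distinction being drawn above.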

> > If there was to be a single firmware supporting multiple board
> > revisions/configurations (as can definitely happen), which could include
> > different sizes of NAND, I think R's implementation would be able to
> > adapt better than E's, as the high-level program can divide up the sizes
> > based on what it sees.
I see no reason why E's wouldn't adapt just as well, given suitably written
driver(s) and init hooks.

Ok. I also see both your chip drivers possess these hooks - which is good as people will tend to use existing drivers as templates rather than write their own from scratch.
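As a sketch of what such a hook could do for the multiple-board-revision case: carve a fixed boot region off the front and give whatever remains to a second partition, sized from the block count detected at init. The struct and function names here are hypothetical and do not reflect either layer's actual interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory partition record: an inclusive block range. */
typedef struct {
    uint32_t first_block;
    uint32_t last_block;
} hyp_partition_t;

/* Lay out two partitions from the detected chip size:
 *   parts[0]: the first boot_blocks blocks (e.g. for a boot loader)
 *   parts[1]: everything else (e.g. for a filesystem)
 * Returns the number of partitions set up, or 0 if the chip is too small. */
int hyp_partition_init(uint32_t total_blocks, uint32_t boot_blocks,
                       hyp_partition_t parts[2])
{
    assert(parts != 0);

    if (boot_blocks == 0 || total_blocks <= boot_blocks)
        return 0;

    parts[0].first_block = 0;
    parts[0].last_block  = boot_blocks - 1;
    parts[1].first_block = boot_blocks;
    parts[1].last_block  = total_blocks - 1;
    return 2;
}
```

The same firmware would then get a 1016-block second partition on a 1024-block part and a 2040-block one on a 2048-block part, with no change to configuration.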

> > In fact, because of the requirement for the
> > drivers to call CYG_NAND_FUNS, it doesn't seem difficult at all to be
> > backwardly compatible. Am I right? Nevertheless, it would be unfortunate
> > to have an API which already needs its low level driver interface
> > updating to a rev 2.
Adding hardware ECC support and making the driver interface
backwards-compatible turned out to break layering, so I chose to change the
interface.
It's a relatively straightforward change in that I have broken up page read and program operations into three: initialise, to read/write a stride of data (length chosen by the NAND layer to mesh with whatever ECC length is provided by the controller), and finalise. The flow inside my NAND layer for programming a page becomes:

* Call chip driver to initialise the write (we expect it to send the command and address)
* For each ECC-sized stride of data:
** If hardware ECC, call the ECC driver to tell it we're about to start a stride
** Call chip driver to write a stride of data
** If hardware ECC, call the ECC driver to get the ECC for the stride now completed and stash it away
* If software ECC, compute it for the page
* Finalise the spare layout using the ECC wherever it came from
* Call chip driver to finalise the write, passing the final spare layout (we expect it to write the spare area and send the program-confirm command).

NB Some hardware ECC's will only compute for the whole page, e.g. AT91SAM9's.
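The program flow described above can be sketched as follows. Everything here is hypothetical: the ops structures, the names, and the simplification that ECC bytes are packed at the start of the spare area (a real layout puts them at chip-specific offsets). The mock driver only records the call order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {                       /* hypothetical chip driver interface */
    void (*write_begin)(void *ctx, uint32_t page);
    void (*write_stride)(void *ctx, const uint8_t *data, size_t len);
    void (*write_finish)(void *ctx, const uint8_t *spare, size_t spare_len);
} hyp_chip_ops_t;

typedef struct {                       /* hypothetical ECC driver interface */
    int    is_hardware;
    size_t data_size;                  /* stride covered by one ECC */
    size_t ecc_size;                   /* ECC bytes per stride */
    void (*hw_init)(void *ctx);
    void (*hw_read)(void *ctx, uint8_t *ecc_out);
    void (*sw_calc)(const uint8_t *data, size_t len, uint8_t *ecc_out);
} hyp_ecc_ops_t;

int hyp_program_page(const hyp_chip_ops_t *chip, const hyp_ecc_ops_t *ecc,
                     void *ctx, uint32_t page, const uint8_t *data,
                     size_t page_size, uint8_t *spare, size_t spare_len)
{
    assert(chip != NULL && ecc != NULL && ecc->data_size > 0);
    size_t nstrides = page_size / ecc->data_size;

    if (page_size % ecc->data_size != 0 || nstrides * ecc->ecc_size > spare_len)
        return -1;

    memset(spare, 0xFF, spare_len);
    chip->write_begin(ctx, page);                 /* command and address */
    for (size_t i = 0; i < nstrides; i++) {
        if (ecc->is_hardware)
            ecc->hw_init(ctx);                    /* about to start a stride */
        chip->write_stride(ctx, data + i * ecc->data_size, ecc->data_size);
        if (ecc->is_hardware)
            ecc->hw_read(ctx, spare + i * ecc->ecc_size); /* stash the ECC */
    }
    if (!ecc->is_hardware)                        /* software ECC for the page */
        for (size_t i = 0; i < nstrides; i++)
            ecc->sw_calc(data + i * ecc->data_size, ecc->data_size,
                         spare + i * ecc->ecc_size);
    chip->write_finish(ctx, spare, spare_len);    /* spare area, then confirm */
    return 0;
}

/* Mock driver: appends one letter per call so the order can be inspected. */
typedef struct { char log[64]; size_t n; } hyp_trace_t;

static void trace(void *ctx, char c)
{
    hyp_trace_t *t = ctx;
    if (t->n + 1 < sizeof t->log) { t->log[t->n++] = c; t->log[t->n] = '\0'; }
}
static void mock_begin(void *ctx, uint32_t page) { (void)page; trace(ctx, 'B'); }
static void mock_stride(void *ctx, const uint8_t *d, size_t len)
{ (void)d; (void)len; trace(ctx, 'W'); }
static void mock_finish(void *ctx, const uint8_t *s, size_t len)
{ (void)s; (void)len; trace(ctx, 'F'); }
static void mock_hw_init(void *ctx) { trace(ctx, 'i'); }
static void mock_hw_read(void *ctx, uint8_t *e)
{ e[0] = e[1] = e[2] = 0xA5; trace(ctx, 'e'); }

const hyp_chip_ops_t hyp_mock_chip = { mock_begin, mock_stride, mock_finish };
const hyp_ecc_ops_t hyp_mock_hw_ecc = { 1, 512, 3, mock_hw_init, mock_hw_read, NULL };
```

In this model a controller that only computes ECC over the whole page would simply set data_size equal to the page size, giving a single stride.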

I am not yet finished this work, but will update all my existing drivers when it is done. In a way, the drawn-out nature of this process has provided extra time for my state of the art to evolve ;-)

Well that's fair enough. I think it's fair to make allowances for work that's actually under active development (rather than vapourware or just promises). Especially since you say further down your mail that it is likely to be done in the next couple of weeks. (I'm not asking you for a concrete commitment - as with anything involving volunteer effort).

 > > Incidentally I note Rutger has a "Samsung" ECC implementation, whereas
 > > you support Samsung K9 chips, but use the normal ECC algorithm. Did
 > > Samsung change their practice?
The "Samsung" ECC implementation has nothing to do with the underlying chip; it's just an algorithm whose details they published,

Indeed, but I sort of expected them to be using it in that context :).

I think in conjunction with some of the higher-level NAND-based products they ship which feature an FTL (USB sticks, SD cards, etc). There is in general no requirement to use any particular ECC algorithm with any particular chip; all the spec sheets tend to say is "use ECC".

Sure. But I was anticipating it may be industry practice, e.g. if Linux-MTD does the same. Maybe due to...

* io_nand_ecc_samsung.c provides a Samsung algorithm of the same parameters, used by the BlackFin board driver.

...it is indeed industry practice, but perhaps only rarely.

Jifl
--
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine

Follow-Ups:
- Re: NAND technical review
  - From: Rutger Hofman

References:
- NAND technical review
  - From: Jonathan Larmour
- Re: NAND technical review
  - From: Ross Younger
- Re: NAND technical review
  - From: Jonathan Larmour
- Re: NAND technical review
  - From: Ross Younger
