On 2016-09-07 20:18 -0600, chadvellacott@sasktel.net wrote:
> I have been reading on "MLC NAnd", and it seems that now I better
> understand the problem of corruption-on-read.
I have some experience of NAND due to working on YAFFS a few years ago. My info may be slightly out of date as NAND has got even fatter in the last few years.
> (a) How common is it for corruption-on-read to occur with "MLC NAnd" (like by something as basic as reads done by the built-in "ROM" "boot"-loader)? (Perhaps the answer is like "N % probability that one or more of the [1 to 4] bits in a cell, shall wrongly change its logical value, after X reads of one or more pages in the same block, Y writes to one or more pages in the same block, and Z erases of the block".) At first I was thinking that the FIRST time data (mini "boot"-loader or otherwise) is read from the "NAnd", corruption might likely occur. But perhaps this corruption-on-read naturally happens ONLY after many reads or writes in a block and many erases of a block. So how common is it?
'Rare'. Corruption-on-read (called 'read disturb' in the literature) only happens after quite a few reads aligned on just the 'wrong' page; 20,000 or so might do it in modern MLC. Write disturb is much more likely than read disturb. Each read that energises the same 'row' in the flash layout increases the error potential a tiny amount; each write increases it quite a lot more (but in modern flash with a flash-aware filesystem you only ever write a page once before erasing it, so this is not an issue).
Bits are 'refreshed' (and the probabilities of error reset) when the bits in question are rewritten. So a really smart flash filesystem will ensure that 'old' data sitting near pages that have been read a lot gets moved.
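As a rough illustration of that 'move it before it rots' idea, here is a minimal Python sketch of per-block read counting. Everything in it is an assumption made for illustration: the names `ReadDisturbTracker`, `record_read` and `refresh` are hypothetical, the limit simply reuses the '20,000 or so' figure above, and it is not a description of how YAFFS or any other filesystem actually does it.

```python
from dataclasses import dataclass, field

# Assumed threshold, taken from the "20,000 or so" estimate above.
READ_DISTURB_LIMIT = 20_000


@dataclass
class Block:
    """Bookkeeping for one erase block (hypothetical structure)."""
    reads_since_refresh: int = 0


@dataclass
class ReadDisturbTracker:
    """Counts page reads per block and flags blocks that need refreshing."""
    limit: int = READ_DISTURB_LIMIT
    blocks: dict = field(default_factory=dict)
    refreshed: list = field(default_factory=list)

    def record_read(self, block_no):
        """Count one page read in block_no; return True if it triggered a refresh."""
        block = self.blocks.setdefault(block_no, Block())
        block.reads_since_refresh += 1
        if block.reads_since_refresh >= self.limit:
            self.refresh(block_no)
            return True
        return False

    def refresh(self, block_no):
        """Stand-in for 'copy live pages elsewhere, then erase this block'.

        Rewriting the data resets the accumulated disturb, so the counter
        goes back to zero. Here we only log the block number.
        """
        self.blocks[block_no].reads_since_refresh = 0
        self.refreshed.append(block_no)
```

Counting per block rather than per page keeps the bookkeeping small, at the cost of refreshing a block slightly earlier than strictly necessary.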
> (b) Would the "MLC NAnd" planned in the computer-cards via "Crowd Supply", have Error-Correction Codes?
Yes. All NAND flash has this; otherwise it would be uselessly unreliable.
> (c) If so, then does whatever reads the "NAnd" on "booting" (I guess it is called the "eGON boot-ROM"), know that it should (and know how to) use those "ECCs", to correct errors (if any) which it encounters when trying to start the "booting" process, so that it loads the correct original bits of the "boot"-loader ("minimalist" or otherwise)?
Yes. All NAND reads check the ECC.
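To show what 'checking the ECC' on a read amounts to, here is a toy sketch using a Hamming(7,4) code, which can correct a single flipped bit in a 7-bit codeword. This is purely illustrative: the function names are made up, and real MLC parts generally need much stronger multi-bit codes (BCH is a common choice) computed over whole pages or sub-pages, not a 4-bit toy like this.

```python
def hamming74_encode(nibble):
    """Encode a 4-bit value as a list of 7 bits (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0 unused so list indices match bit positions
    code[3], code[5], code[6], code[7] = d
    # Parity bits sit at positions 1, 2 and 4.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, syndrome); syndrome is the corrected position, or 0."""
    code = [0] + list(bits)
    # XOR of the positions of all set bits is 0 for a valid codeword,
    # and equals the position of the bad bit after a single flip.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    if syndrome:
        code[syndrome] ^= 1  # correct the single-bit error
    d = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(d))
    return nibble, syndrome


stored = hamming74_encode(0b1011)
stored[4] ^= 1  # simulate one bit disturbed in the flash (position 5)
value, fixed_at = hamming74_decode(stored)
```

A non-zero syndrome is also the sort of signal a filesystem can use to decide that a block is getting flaky and its data should be moved.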
The YAFFS site has a load of info on the issue of NAND (un)reliability, and what it does to manage/mitigate it:
http://yaffs.net/documents/yaffs-nand-flash-failure-mitigation
Specifically:
http://yaffs.net/documents/yaffs-nand-flash-failure-mitigation#Read_disturb
HTH
Wookey