
Here is a list of all NTFS design issues which have come up that affect the
structure, along with the current resolution (if there is one) of each issue. The
resolution of these issues affects the "NTFS Design Specification 1.1" issued
May 29, 1991. This list will be the final qualification to the spec until
there is time to update it to a form which reflects the actual implementation.
Of course the most precise definition of NTFS will always be in the header
file which describes its structure: ntfs.h.
These issues have been collected primarily from our own internal review and
the feedback received from MarkZ. They are listed here in no particular
order.
Issue 1:
Support for nontagged attributes is a pain in the low-level attribute
routines, as well as in Format and ChkDsk. They are of very little
value to the File System in terms of space or performance.
Resolution:
Nontagged attributes are being dropped for the purposes of NTFS's own
use of attributes to implement the disk structure. Nontagged attributes
will be supported with the general table support.
Issue 2:
The EXTERNAL_ATTRIBUTES attribute should have a better name, and its
definition should be changed to simplify various NTFS algorithms.
Resolution:
The attribute name has been changed to the ATTRIBUTE_LIST attribute.
It is still only created when a file requires more than one file record
segment. At that time it is created to list all attributes (including
those in the base file record) by type code and (optional) name. It is
ordered by Attribute Type Code and Attribute Name.
One reason for this change is to facilitate the enumeration of all
attributes for a file with multiple file record segments. This
slightly different definition also gives NTFS's attribute placement
policy more freedom to shuffle attributes around within the file
record segments.
Issue 3:
Attribute ordering rules within the file, within each file record segment,
and within the ATTRIBUTE_LIST were not completely specified.
Resolution:
The only rule for the ordering of attributes within each file, if there
are multiple file record segments, is that STANDARD_INFORMATION must be
in the base file record segment, and (at least the first part of) the
ATTRIBUTE_LIST attribute must also be in the base file record segment.
In general, the system should try to keep the other system-defined
attributes with the lowest Attribute Type Codes present in the base file
record segment when possible, for performance reasons.
Within each file record segment, attributes will be ordered by type code,
name, and then value. (If an attribute is not unique in type code and
name, then it must be indexed and the value must be referenced.)
The entries of the ATTRIBUTE_LIST will be ordered by attribute code and
name.
Reliance on these ordering rules may be used to speed up attribute lookup
algorithms.
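As a rough illustration of how the ordering rules can speed up lookup, the
sketch below keeps ATTRIBUTE_LIST entries sorted by type code and name and
finds an entry by binary search rather than a linear scan. The entry layout,
the field names, and the type code values are hypothetical and are not taken
from ntfs.h; names are compared as plain Python strings, which is only an
assumption about collation.

```python
from bisect import bisect_left
from typing import NamedTuple, Optional


class ListEntry(NamedTuple):
    """Hypothetical ATTRIBUTE_LIST entry; field names are illustrative only."""
    type_code: int  # Attribute Type Code
    name: str       # Attribute Name ("" when the attribute is unnamed)
    segment: int    # file record segment that holds the attribute


def sort_entries(entries):
    # Order by type code first, then by name, as the resolution requires.
    return sorted(entries, key=lambda e: (e.type_code, e.name))


def find_entry(sorted_entries, type_code, name="") -> Optional[ListEntry]:
    # Binary search on the (type_code, name) key; valid only because the
    # list is kept in that order.
    keys = [(e.type_code, e.name) for e in sorted_entries]
    i = bisect_left(keys, (type_code, name))
    if i < len(keys) and keys[i] == (type_code, name):
        return sorted_entries[i]
    return None
```

A caller would sort once when the list is built and then call find_entry for
each lookup.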
Issue 4:
NTFS is NOT secure on removable media without data encryption.
Resolution:
Functionality for the encryption of communications and physical media
is already planned for Product 2 of NT, at which time we will decide
what the best mechanism will be for integrating this support with
removable NTFS volumes. We must ensure now that this can be implemented
in an upward-compatible manner.
Issue 5:
It would be very desirable for WINX to have the ability to uniquely
identify and open files by a small number.
Resolution:
Logically the ability to use this functionality must be controlled by
some privilege, as it is expensive and nearly impossible to come up with a
consistent strategy on how to do correct path traversal checking, in a
system such as NTFS which supports multiple directory links to a single
file. Once the requirement for a special privilege is accepted, it is
relatively easy for NTFS to support an API which would allow files to
be opened by their (64-bit) File Reference number. The File Reference
is perfect for this purpose, as it includes a 16-bit cyclically-reused
sequence number to detect the attempt to use a stale File Reference.
(I.e., the original file with the same 48-bit Base File Record address has
been deleted, and a new file has been created at the same address.)
THIS REQUIRES A NEW NT I/O API.
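The stale-reference check described above can be sketched as follows. The
text fixes only the field widths (48-bit address, 16-bit sequence number);
placing the address in the low bits and the sequence number in the high bits
is an assumption made for this example, and all function names are
hypothetical.

```python
ADDR_BITS = 48
SEQ_BITS = 16
ADDR_MASK = (1 << ADDR_BITS) - 1
SEQ_MASK = (1 << SEQ_BITS) - 1


def make_file_reference(address: int, sequence: int) -> int:
    # Assumed layout: address in the low 48 bits, sequence in the high 16.
    return ((sequence & SEQ_MASK) << ADDR_BITS) | (address & ADDR_MASK)


def split_file_reference(ref: int):
    # Returns (base file record address, sequence number).
    return ref & ADDR_MASK, (ref >> ADDR_BITS) & SEQ_MASK


def next_sequence(sequence: int) -> int:
    # Cyclic reuse: the counter wraps around after 2**16 - 1.
    return (sequence + 1) & SEQ_MASK


def is_stale(ref: int, current_sequence: int) -> bool:
    # current_sequence is the value now stored in the base file record
    # at the address named by ref.
    _, seq = split_file_reference(ref)
    return seq != current_sequence
```

Because the sequence number is only 16 bits wide and is reused cyclically, a
reference held across exactly 65536 reuses of the same address would not be
detected as stale by this check.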
Issue 6:
Enumeration of files in a directory in NT could be very slow, since
to get more than just a file's name requires reading (at least) the
base file record for the file.
Resolution:
The initial NT-based implementation of NTFS will come up with a
strategy for clustering file record segments together in the MFT for
files created in the same directory. Current thinking is that this
will be done *without* change to the NTFS structure definition. So,
for example, the first 128 files in a directory might be contiguous in
the MFT, and then the second 128 will also be contiguous, etc. This
will allow the implementation to prefetch files up to 128 file record
segments at a time with a large spiral read, then expect cache hits during
the enumeration.
Secondly, at some point the implementation will cache enumeration
information, to make subsequent enumeration of the same directory
extremely fast.
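A minimal sketch of the prefetch arithmetic, assuming that each group of 128
file record segments is aligned on a multiple of 128 within the MFT (the text
does not say this; it is an assumption for the example). The function names
are hypothetical.

```python
GROUP_SIZE = 128  # figure taken from the example in the text


def prefetch_range(record_number: int, group_size: int = GROUP_SIZE) -> range:
    # File record segments to fetch in one large read when record_number
    # is touched, assuming groups are aligned on multiples of group_size.
    start = (record_number // group_size) * group_size
    return range(start, start + group_size)


def count_reads(record_numbers, group_size: int = GROUP_SIZE) -> int:
    # One large read per distinct group touched; every other record in
    # an already-fetched group is expected to be a cache hit.
    return len({n // group_size for n in record_numbers})
```

Under these assumptions, enumerating a directory whose files occupy two full
groups costs two large reads, whereas files scattered across the MFT cost up
to one read each.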
Issue 7:
Is it an unnecessary complexity to NTFS to support multiple collating
rules, as opposed to a simple byte-comparison collation? Note that
frequently the caller collates himself anyway.
Resolution:
This is not resolved yet, pending further discussion.
The current reason NTFS plans to support multiple collating rules,
is that collating in the caller can have bad performance and response
characteristics in large directories. For example, consider a Windows
App which requests the enumeration of a directory with 200 files (possibly
over the network to a heavily loaded server), and it is going to
display this enumeration in a List box with 10 or 20 lines. If it
does not have to collate the enumeration, it can start displaying
as soon as it receives part of the enumeration. Otherwise it has
to wait to get the entire enumeration before it can collate and display
anything.
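The response-time argument can be illustrated with a small sketch. It models
the enumeration as a stream of names and counts how many entries must arrive
before the first screenful can be shown. Everything here (the function names,
the counting wrapper, and sorting by plain string comparison) is an
illustrative assumption, not part of any NT API.

```python
from itertools import islice


def counted(stream, log):
    # Records each entry in log as it is received from the stream.
    for name in stream:
        log.append(name)
        yield name


def first_page_precollated(stream, lines):
    # The file system returns entries already collated, so display can
    # start once `lines` entries have arrived.
    return list(islice(stream, lines))


def first_page_caller_collates(stream, lines):
    # The caller must receive the whole enumeration before it can sort.
    return sorted(stream)[:lines]
```

With 200 names and a 10-line list box, the first function consumes 10 entries
from the stream and the second consumes all 200 before anything is displayed.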
Issue 8:
Should there be a bit in STANDARD_INFORMATION to indicate whether a
file record has an INDEX attribute or not?
Resolution:
There is no plan to do this, unless we find additional reasons
to do so that we are missing. Currently we see how this bit could
speed the rejection of illegal path specifications, but it would
not speed the acceptance of correct ones. Note that from the structure
of NTFS, it is legal for a file to have both an INDEX attribute *and*,
for example, a DATA attribute.
Issue 9:
The algorithms and consistency rules surrounding the 8.3 indices need to
be clarified.
Resolution:
This will be done by 7/31.
Issue 10:
Why not eliminate the VERSION attribute and move it to
STANDARD_INFORMATION?
Resolution:
We will do this, and then define an additional file attribute
and/or field which controls whether or not versioning is enabled and
possibly how many versions are allowed for a file.
Issue 11:
There should be a range of system-defined attribute codes which are
not allowed to be duplicated, as this will speed up some of the
lookup algorithms.
Resolution:
This will be done.
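One way a reserved range of non-duplicable codes could help lookup is
sketched below: a scan over attributes sorted by type code can stop at the
first match when the code lies in the unique range. The boundary value and
the (type_code, name) tuple layout are hypothetical.

```python
UNIQUE_CODE_LIMIT = 0x100  # hypothetical: codes below this appear at most once


def find_all(attributes, type_code):
    # attributes: list of (type_code, name) pairs sorted by code, then name.
    matches = []
    for code, name in attributes:
        if code > type_code:
            break  # sorted order: nothing further can match
        if code == type_code:
            matches.append((code, name))
            if type_code < UNIQUE_CODE_LIMIT:
                break  # non-duplicable code: the first hit is the only hit
    return matches
```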
Issue 12:
Is duplication of the log file the correct way to add redundancy to
NTFS to allow mounting in the event of read errors?
Resolution:
Upon further analysis, it was determined that the needed redundancy
was incorrectly placed. It is more important to duplicate the first
few entries of the MFT, than to duplicate the start of the log file.
This change will be made.
Issue 13:
The spec describes how access to individual attribute types may be
controlled by special ACEs, which is incompatible with the current
NT APIs and our security strategy.
Resolution:
This will be fixed. Access to user-defined attributes will be controlled
by the READ_ATTRIBUTES and WRITE_ATTRIBUTES access rights.
Issue 14:
A file attribute should be added which supports more efficient handling
of temporary files.
Resolution:
An attribute will be added for files, and possibly directories, which
will enable NTFS to communicate "temporary file" handling to the Cache
Manager. Temporary files will never be set dirty in the Cache Manager
or written to disk by the Lazy Writer, although the File Record will
be correctly updated to keep the volume consistent. If a temporary file
is deleted, then all writes to its data are eliminated. If MM discovers
that memory is getting tight, it may choose to flush data to temporary
files, so that it can free the pages. In this case the
correct data for the file will eventually be faulted back in.
This makes the performance of I/O to temporary files approach the
performance of putting them on a RAM disk. An advantage over RAM disk,
though, is that no one has to specify how much space should be used
for this purpose.
Issue 15:
It would be nice to have some flag in each file record segment to say
if it is in use or not. This would simplify chkdsk algorithms, although
it would require the record to be written on deletion.
Resolution:
This will be done. It is difficult to suppress the write of the file
record on deletion anyway.
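A minimal sketch of such a flag, assuming it is a single bit in a per-segment
flags word. The bit value and the function names are hypothetical, and the
MFT is modelled simply as a list of flags words, one per file record segment.

```python
IN_USE = 0x0001  # hypothetical flag bit in each file record segment


def allocate(flags: int) -> int:
    # Mark the segment as in use when a file record is created.
    return flags | IN_USE


def release(flags: int) -> int:
    # Clear the bit on deletion; this is the write the issue refers to.
    return flags & ~IN_USE


def records_in_use(mft_flags):
    # A chkdsk-style scan: indices of the segments marked in use,
    # found without consulting any other structure.
    return [i for i, f in enumerate(mft_flags) if f & IN_USE]
```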