NFSv4                                                          T. Haynes
Internet-Draft                                              Primary Data
Intended status: Standards Track                         August 07, 2017
Expires: February 8, 2018


              Parallel NFS (pNFS) Flexible File Layout v2
                 draft-haynes-nfsv4-flex-filesv2-00.txt

Abstract

   The Parallel Network File System (pNFS) allows a separation between
   the metadata (onto a metadata server) and data (onto a storage
   device) for a file. The flexible file layout type is an extension to
   pNFS which allows the use of storage devices in a fashion such that
   they require only a quite limited degree of interaction with the
   metadata server, using already existing protocols. This document
   describes two extensions to the flexible file layout type to allow
   for multiple stateids for tightly coupled NFSv4 models and an
   additional security mechanism for loosely coupled models.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 8, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect



Haynes                  Expires February 8, 2018                [Page 1]

Internet-Draft             Flex File Layout v2               August 2017


   to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the Trust
   Legal Provisions and are provided without warranty as described in
   the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Definitions
     1.2.  Requirements Language
   2.  XDR Description of the Flexible File Layout Type
     2.1.  Code Components Licensing Notice
   3.  Flexible File Layout Type v2
     3.1.  ffv2_layout4
   4.  Security Considerations
     4.1.  RPCSEC_GSS and Security Services
       4.1.1.  Loosely Coupled
   5.  IANA Considerations
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Appendix A.  Acknowledgments
   Appendix B.  RFC Editor Notes
   Author's Address

1.  Introduction

   In the parallel Network File System (pNFS), the metadata server
   returns layout type structures that describe where file data is
   located. There are different layout types for different storage
   systems and methods of arranging data on storage devices.
   [flexfiles] defines the flexible file layout type used with file-
   based data servers that are accessed using the Network File System
   (NFS) protocols: NFSv3 [RFC1813], NFSv4.0 [RFC7530], NFSv4.1
   [RFC5661], and NFSv4.2 [RFC7862].

   The first version of the flexible file layout type had two issues
   which could not be addressed in [flexfiles] because of existing
   implementations.
   The first issue was that under the tightly coupled model for an
   NFSv4 implementation, either a global stateid or an anonymous
   stateid needed to be used. The second issue was that under the
   loosely coupled model, for a secure Remote Procedure Call (RPC)
   ([RFC5531]) implementation, each of the client, metadata server, and
   storage devices needed to implement an RPC-application-defined
   structured privilege assertion with RPCSEC_GSS version 3
   (RPCSEC_GSSv3) [RFC7861].

   The second version of the flexible file layout type addresses both of
   these issues.

1.1.  Definitions

   control communication requirements:  defines for a layout type the
      details regarding information on layouts, stateids, file metadata,
      and file data which must be communicated between the metadata
      server and the storage devices.

   control protocol:  defines a particular mechanism that an
      implementation of a layout type would use to meet the control
      communication requirement for that layout type. This need not be
      a protocol as normally understood. In some cases the same
      protocol may be used as a control protocol and data access
      protocol.

   data file:  is that part of the file system object which contains the
      content.

   fencing:  is when the metadata server prevents the storage devices
      from processing I/O from a specific client to a specific file.

   file layout type:  is a layout type in which the storage devices are
      accessed via the NFS protocol (see Section 13 of [RFC5661]).

   layout:  informs a client of which storage devices it needs to
      communicate with (and over which protocol) to perform I/O on a
      file. The layout might also provide some hints about how the
      storage is physically organized.

   layout iomode:  describes whether the layout granted to the client is
      for read or read/write I/O.
   layout stateid:  is a 128-bit quantity returned by a server that
      uniquely defines the layout state provided by the server for a
      specific layout that describes a layout type and file (see
      Section 12.5.2 of [RFC5661]). Further, Section 12.5.3 of
      [RFC5661] describes the difference between a layout stateid and a
      normal stateid.

   layout type:  describes both the storage protocol used to access the
      data and the aggregation scheme used to lay out the file data on
      the underlying storage devices.

   loose coupling:  is when the metadata server and the storage devices
      do not have a control protocol present.

   metadata file:  is that part of the file system object which
      describes the object and not the content. E.g., it could be the
      time since last modification, access, etc.

   metadata server (MDS):  is the pNFS server which provides metadata
      information for a file system object. It also is responsible for
      generating layouts for file system objects. Note that the MDS is
      responsible for directory-based operations.

   recalling a layout:  is when the metadata server uses a back channel
      to inform the client that the layout is to be returned in a
      graceful manner. Note that the client has the opportunity to
      flush any writes, etc., before replying to the metadata server.

   revoking a layout:  is when the metadata server invalidates the
      layout such that neither the metadata server nor any storage
      device will accept any access from the client with that layout.

   stateid:  is a 128-bit quantity returned by a server that uniquely
      defines the open and locking states provided by the server for a
      specific open-owner or lock-owner/open-owner pair for a specific
      file and type of lock.

   storage device:  designates the target to which clients may direct
      I/O requests when they hold an appropriate layout. See
      Section 2.1 of [pNFSLayouts] for further discussion of the
      difference between a data store and a storage device.
   tight coupling:  is when the metadata server and the storage devices
      do have a control protocol present.

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  XDR Description of the Flexible File Layout Type

   This document contains the external data representation (XDR)
   [RFC4506] description of the flexible file layout type version 2.
   The XDR description is embedded in this document in a way that makes
   it simple for the reader to extract into a ready-to-compile form.
   The reader can feed this document into the following shell script to
   produce the machine readable XDR description of the flexible file
   layout type version 2:

   #!/bin/sh
   grep '^ *///' $* | sed 's?^ */// ??' | sed 's?^ *///$??'

   That is, if the above script is stored in a file called "extract.sh",
   and this document is in a file called "spec.txt", then the reader can
   do:

   sh extract.sh < spec.txt > flex_filesv2_prot.x

   The effect of the script is to remove leading white space from each
   line, plus a sentinel sequence of "///".

   The embedded XDR file header follows. Subsequent XDR descriptions,
   with the sentinel sequence, are embedded throughout the document.

   Note that the XDR code contained in this document depends on types
   from both the flex files version 1 flex_files_prot.x file
   ([flexfiles]) and the NFSv4.1 nfs4_prot.x file ([RFC5662]). This
   includes both NFS types that end with a 4, such as offset4, length4,
   etc., as well as more generic types such as uint32_t and uint64_t.

2.1.  Code Components Licensing Notice

   Both the XDR description and the scripts used for extracting the XDR
   description are Code Components as described in Section 4 of "Legal
   Provisions Relating to IETF Documents" [LEGAL].
   These Code Components are licensed according to the terms of that
   document.

   /// /*
   ///  * Copyright (c) 2012 IETF Trust and the persons identified
   ///  * as authors of the code. All rights reserved.
   ///  *
   ///  * Redistribution and use in source and binary forms, with
   ///  * or without modification, are permitted provided that the
   ///  * following conditions are met:
   ///  *
   ///  * o Redistributions of source code must retain the above
   ///  *   copyright notice, this list of conditions and the
   ///  *   following disclaimer.
   ///  *
   ///  * o Redistributions in binary form must reproduce the above
   ///  *   copyright notice, this list of conditions and the
   ///  *   following disclaimer in the documentation and/or other
   ///  *   materials provided with the distribution.
   ///  *
   ///  * o Neither the name of Internet Society, IETF or IETF
   ///  *   Trust, nor the names of specific contributors, may be
   ///  *   used to endorse or promote products derived from this
   ///  *   software without specific prior written permission.
   ///  *
   ///  *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS
   ///  *   AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
   ///  *   WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   ///  *   IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
   ///  *   FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO
   ///  *   EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
   ///  *   LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
   ///  *   EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
   ///  *   NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
   ///  *   SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   ///  *   INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
   ///  *   LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
   ///  *   OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
   ///  *   IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
   ///  *   ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
   ///  *
   ///  * This code was derived from RFCTBD10.
   ///  * Please reproduce this note if possible.
   ///  */
   ///
   /// /*
   ///  * flex_filesv2_prot.x
   ///  */
   ///
   /// /*
   ///  * The following include statements are for example only.
   ///  * The actual XDR definition files are generated separately
   ///  * and independently and are likely to have a different name.
   ///  * %#include
   ///  * %#include
   ///  */
   ///

3.  Flexible File Layout Type v2

   This document defines structures associated with the layouttype4
   value LAYOUT4_FLEX_FILES_V2 and it presents the minimal XDR changes
   necessary from LAYOUT4_FLEX_FILES, which is described in
   [flexfiles]. [RFC5661] specifies the loc_body structure as an XDR
   type "opaque". The opaque layout is uninterpreted by the generic
   pNFS client layers, but is interpreted by the flexible file layout
   type implementation. This section defines the structure of this
   otherwise opaque value, ffv2_layout4.

3.1.  ffv2_layout4

   /// struct ffv2_data_server4 {
   ///     deviceid4               ffds_deviceid;
   ///     uint32_t                ffds_efficiency;
   ///     stateid4                ffds_stateid<>;
   ///     nfs_fh4                 ffds_fh_vers<>;
   ///     fattr4_owner            ffds_user;
   ///     fattr4_owner_group      ffds_group;
   ///     opaque_auth             ffds_auth;
   /// };
   ///
   /// struct ffv2_mirror4 {
   ///     ffv2_data_server4       ffm_data_servers<>;
   /// };
   ///
   /// struct ffv2_layout4 {
   ///     length4                 ffl_stripe_unit;
   ///     ffv2_mirror4            ffl_mirrors<>;
   ///     ff_flags4               ffl_flags;
   ///     uint32_t                ffl_stats_collect_hint;
   /// };
   ///

   The ffv2_layout4 structure specifies a layout over a set of mirrored
   copies of that portion of the data file described in the current
   layout segment. It is possible that the file is concatenated from
   more than one layout segment. Each layout segment MAY represent
   different striping parameters, applying respectively only to the
   layout segment byte range.

   The ffl_stripe_unit field is the stripe unit size in use for the
   current layout segment. The number of stripes is given inside each
   mirror by the number of elements in ffm_data_servers.
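   As an illustration of how the stripe unit size and the number of
   stripes might be used together, the following is a minimal sketch
   and not part of this specification. It assumes that stripe units are
   assigned to the entries of ffm_data_servers in round-robin order and
   that the offset on the storage device equals the logical offset
   within the file (a sparse mapping). The function name locate_stripe
   and the sample values are hypothetical.

```python
from typing import Tuple


def locate_stripe(offset: int, stripe_unit: int, stripe_count: int) -> Tuple[int, int]:
    """Map a logical byte offset to (index into ffm_data_servers, device offset).

    Assumptions (not stated by the draft text itself): round-robin
    placement of stripe units, and a sparse mapping in which the offset
    on the storage device is the same as the logical offset.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if stripe_count < 1:
        raise ValueError("a mirror needs at least one data server")
    if stripe_count == 1:
        # A single stripe carries ffl_stripe_unit == 0; all I/O goes to entry 0.
        return 0, offset
    if stripe_unit <= 0:
        raise ValueError("striping needs a positive ffl_stripe_unit")
    unit_number = offset // stripe_unit          # which stripe unit holds the byte
    return unit_number % stripe_count, offset    # sparse: device offset unchanged


if __name__ == "__main__":
    # Hypothetical layout: four data servers per mirror, 64 KiB stripe unit.
    for off in (0, 65536, 300000):
        print(off, locate_stripe(off, 65536, 4))
```

   Because the same stripe unit size and stripe count are assumed for
   every mirror, the index computed here would apply to each mirror's
   ffm_data_servers array alike.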
   If the number of stripes is one, then the value for ffl_stripe_unit
   MUST default to zero. The only supported mapping scheme is sparse
   and is detailed in Section 6 of [flexfiles]. Note that there is an
   assumption here that both the stripe unit size and the number of
   stripes are the same across all mirrors.

   The ffl_mirrors field is the array of mirrored storage devices which
   provide the storage for the current stripe, see Figure 1.

                          +-----------+
                          |           |
                          |           |
                          |   File    |
                          |           |
                          |           |
                          +-----+-----+
                                |
                   +------------+------------+
                   |                         |
              +----+-----+             +-----+----+
              | Mirror 1 |             | Mirror 2 |
              +----+-----+             +-----+----+
                   |                         |
              +-----------+             +-----------+
              |+-----------+            |+-----------+
              ||+-----------+           ||+-----------+
              +||  Storage  |           +||  Storage  |
               +|  Devices  |            +|  Devices  |
                +-----------+             +-----------+

                                Figure 1

   The ffl_mirrors field represents an array of state information for
   each mirrored copy of the current layout segment. Each element is
   described by an ffv2_mirror4 type.

   ffds_deviceid provides the deviceid of the storage device holding the
   data file.

   ffds_fh_vers is an array of filehandles of the data file matching to
   the available NFS versions on the given storage device. There MUST
   be exactly as many elements in ffds_fh_vers as there are in both
   ffda_versions (see Section 4.1 of [flexfiles]) and ffds_stateid.
   Each element of the array corresponds to a particular combination of
   ffdv_version, ffdv_minorversion, and ffdv_tightly_coupled provided
   for the device. The array allows for server implementations which
   have different filehandles for different combinations of version,
   minor version, and coupling strength. See Section 5.3 of [flexfiles]
   for how to handle versioning issues between the client and storage
   devices.

   For tight coupling, ffds_stateid provides the stateids to be used by
   the client to access the file.
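   The structural rules stated so far (the same number of stripes in
   every mirror, a zero stripe unit when there is a single stripe, and
   equal lengths for ffds_fh_vers, ffds_stateid, and ffda_versions) can
   be checked mechanically. The following is a minimal sketch of such a
   check, not a definitive implementation. The class names, the
   version_count field (standing in for the length of ffda_versions
   obtained for the device), and the reading of "MUST default to zero"
   as "must equal zero" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataServer:
    """Hypothetical stand-in for the array-valued parts of ffv2_data_server4."""
    deviceid: bytes
    stateids: List[bytes]   # models ffds_stateid<>
    fh_vers: List[bytes]    # models ffds_fh_vers<>
    version_count: int      # assumed: len(ffda_versions) for this device


@dataclass
class Layout:
    """Hypothetical stand-in for ffv2_layout4; each mirror is a list of servers."""
    stripe_unit: int
    mirrors: List[List[DataServer]]


def layout_problems(layout: Layout) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    stripe_counts = {len(mirror) for mirror in layout.mirrors}
    if len(stripe_counts) > 1:
        problems.append("mirrors disagree on the number of stripes")
    if stripe_counts == {1} and layout.stripe_unit != 0:
        problems.append("single stripe but ffl_stripe_unit is not zero")
    for m, mirror in enumerate(layout.mirrors):
        for s, ds in enumerate(mirror):
            if not (len(ds.fh_vers) == len(ds.stateids) == ds.version_count):
                problems.append(
                    f"mirror {m} server {s}: ffds_fh_vers, ffds_stateid and "
                    "ffda_versions differ in length"
                )
    return problems


if __name__ == "__main__":
    ds = DataServer(b"dev-1", [b"stateid"], [b"fh-a", b"fh-b"], 2)
    print(layout_problems(Layout(stripe_unit=4096, mirrors=[[ds]])))
```

   A client could run a check of this kind after decoding the layout
   and before issuing I/O; how it reacts to a violation is outside the
   scope of the sketch.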
   For loose coupling and an NFSv4 storage device, the client may use
   anonymous stateids to perform I/O on the storage device as there is
   no use for the metadata server stateid (no control protocol). In
   such a scenario, the server MUST set the ffds_stateid entries to be
   anonymous stateids.

   For loose coupling, ffds_auth provides the RPC credentials needed for
   secure access to the storage devices. If secure access is not
   needed, i.e., the synthetic ids are sufficient, or in a tight
   coupling, the server should use the AUTH_NONE flavor and a zero
   length opaque body to minimize the returned structure length.

   [[AI1: after the lesson learned from ffds_stateid, we either need to
   put an array here or define all of the file handles to share the same
   credentials. And as Olga points out in her email, this gets big
   fast. Especially if we throw in many mirrored copies! --TH]]

4.  Security Considerations

   All of the security considerations in [flexfiles] apply here. In
   addition, this document addresses how security mechanisms, such as
   Kerberos V5 GSS-API [RFC4121], can be applied to the loosely coupled
   model.

4.1.  RPCSEC_GSS and Security Services

4.1.1.  Loosely Coupled

   Under this coupling model, the principal used to authenticate the
   metadata file is different from that used to authenticate the data
   file. For the metadata server, the RPC credentials would be
   generated by the same source as the client. For RPC credentials to
   the data on the storage device, the metadata server would be
   responsible for their generation. Such "credentials" SHOULD be
   limited to just the data file being accessed. Using Kerberos V5
   GSS-API [RFC4121], some possible approaches would be:

   o  a dedicated/throwaway client principal name akin to the synthetic
      uid/gid schemes.

   o  authorization data in the ticket.

   o  an out-of-band scheme between the client and metadata server.
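   Whichever of these approaches produces the credential, the metadata
   server still has to populate ffds_stateid and ffds_auth as
   Section 3.1 describes for loose coupling. The following is a minimal
   sketch of that step, offered as an illustration only. It assumes
   that every advertised version is an NFSv4 version, models a stateid4
   as a (seqid, other) pair, and models opaque_auth as a (flavor, body)
   pair. The function name and the sample token are hypothetical. The
   numeric values are AUTH_NONE = 0 and RPCSEC_GSS = 6 from [RFC5531],
   and the anonymous stateid is the one with a zero seqid and an
   all-zero "other" field described in [RFC5661].

```python
from typing import List, Optional, Tuple

AUTH_NONE = 0     # auth_flavor value from RFC 5531
RPCSEC_GSS = 6    # auth_flavor value from RFC 5531

Stateid = Tuple[int, bytes]      # (seqid, 12-byte "other"), modelling stateid4
OpaqueAuth = Tuple[int, bytes]   # (flavor, body), modelling opaque_auth

# Anonymous stateid: seqid of zero and "other" of all zero bytes.
ANONYMOUS_STATEID: Stateid = (0, bytes(12))


def loose_coupling_fields(
    version_count: int,
    credential: Optional[OpaqueAuth] = None,
) -> Tuple[List[Stateid], OpaqueAuth]:
    """Build (ffds_stateid, ffds_auth) for one loosely coupled data server.

    version_count is the assumed number of ffda_versions entries; one
    anonymous stateid is emitted per entry so the array lengths match.
    credential is None when the synthetic ids are sufficient.
    """
    stateids = [ANONYMOUS_STATEID] * version_count
    if credential is None:
        # No secure access needed: AUTH_NONE with a zero-length body.
        auth: OpaqueAuth = (AUTH_NONE, b"")
    else:
        auth = credential
    return stateids, auth


if __name__ == "__main__":
    print(loose_coupling_fields(2))
    # Hypothetical opaque credential body generated by the metadata server.
    print(loose_coupling_fields(1, (RPCSEC_GSS, b"example-token")))
```

   The sketch hands back a single ffds_auth value per data server,
   which matches the XDR in Section 3.1; it does not attempt to resolve
   the open question of whether each filehandle needs its own
   credential.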
   Depending on the implementation details, fencing would then be
   controlled either by expiring the credential or by modifying the
   synthetic uid or gid on the data file. I.e., if the credentials are
   at a finer granularity than the synthetic ids, it might be possible
   to also fence just one client from the file.

5.  IANA Considerations

   [RFC5661] introduced a registry for "pNFS Layout Types Registry" and
   as such, new layout type numbers need to be assigned by IANA. This
   document defines the protocol associated with the existing layout
   type number, LAYOUT4_FLEX_FILES_V2 (see Table 1).

   +-----------------------+-------+----------+-----+----------------+
   | Layout Type Name      | Value | RFC      | How | Minor Versions |
   +-----------------------+-------+----------+-----+----------------+
   | LAYOUT4_FLEX_FILES_V2 | 0x6   | RFCTBD10 | L   | 1              |
   +-----------------------+-------+----------+-----+----------------+

                    Table 1: Layout Type Assignments

6.  References

6.1.  Normative References

   [LEGAL]    IETF Trust, "Legal Provisions Relating to IETF Documents",
              November 2008.

   [RFC1813]  IETF, "NFS Version 3 Protocol Specification", RFC 1813,
              June 1995.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4121]  Zhu, L., Jaganathan, K., and S. Hartman, "The Kerberos
              Version 5 Generic Security Service Application Program
              Interface (GSS-API) Mechanism Version 2", RFC 4121, July
              2005.

   [RFC4506]  Eisler, M., "XDR: External Data Representation Standard",
              STD 67, RFC 4506, May 2006.

   [RFC5531]  Thurlow, R., "RPC: Remote Procedure Call Protocol
              Specification Version 2", RFC 5531, May 2009.

   [RFC5661]  Shepler, S., Ed., Eisler, M., Ed., and D. Noveck, Ed.,
              "Network File System (NFS) Version 4 Minor Version 1
              Protocol", RFC 5661, January 2010.

   [RFC5662]  Shepler, S., Ed., Eisler, M., Ed., and D.
              Noveck, Ed., "Network File System (NFS) Version 4 Minor
              Version 1 External Data Representation Standard (XDR)
              Description", RFC 5662, January 2010.

   [RFC7530]  Haynes, T. and D. Noveck, "Network File System (NFS)
              Version 4 Protocol", RFC 7530, March 2015.

   [RFC7862]  Haynes, T., "NFS Version 4 Minor Version 2", RFC 7862,
              November 2016.

   [flexfiles]
              Halevy, B. and T. Haynes, "Parallel NFS (pNFS) Flexible
              File Layout", draft-ietf-nfsv4-flex-files-13 (Work In
              Progress), July 2017.

   [pNFSLayouts]
              Haynes, T., "Requirements for pNFS Layout Types",
              draft-ietf-nfsv4-layout-types-05 (Work In Progress), July
              2017.

6.2.  Informative References

   [RFC7861]  Adamson, W. and N. Williams, "Remote Procedure Call (RPC)
              Security Version 3", RFC 7861, November 2016.

Appendix A.  Acknowledgments

   Dave Noveck inspired the need for multiple stateids for the tightly
   coupled model in [flexfiles].

   Olga Kornievskaia inspired the need for another security mechanism
   for the loosely coupled model in [flexfiles].

Appendix B.  RFC Editor Notes

   [RFC Editor: please remove this section prior to publishing this
   document as an RFC]

   [RFC Editor: prior to publishing this document as an RFC, please
   replace all occurrences of RFCTBD10 with RFCxxxx where xxxx is the
   RFC number of this document]

Author's Address

   Thomas Haynes
   Primary Data, Inc.
   4300 El Camino Real Ste 100
   Los Altos, CA 94022
   USA

   Phone: +1 408 215 1519
   Email: thomas.haynes@primarydata.com